To the children in my life 


L'homme [. . .] passe à travers des forêts de symboles 

Qui l’observent avec des regards familiers. 

Comme de longs échos qui de loin se confondent 

Dans une ténébreuse et profonde unité, 

Vaste comme la nuit et comme la clarté 
BAUDELAIRE, Les Fleurs du Mal 
Introduction: glimpses of the theory beneath 
Monstrous Moonshine 


When you are collecting mushrooms, you only see the mushroom itself. But if 
you are a mycologist, you know that the real mushroom is in the earth. There’s an 
enormous thing down there, and you just see the fruit, the body that you eat. In 
mathematics, the upper part of the mushroom corresponds to theorems that you 
see, but you don’t see the things that are below, that is: problems, conjectures, 
mistakes, ideas, etc. 

V. I. Arnold [17] 


What my experience of mathematical work has taught me again and again, is that 
the proof always springs from the insight, and not the other way around — and 
that the insight itself has its source, first and foremost, in a delicate and obstinate 
feeling of the relevant entities and concepts and their mutual relations. The guiding 
thread is the inner coherence of the image which gradually emerges from the 
mist, as well as its consonance with what is known or foreshadowed from other 
sources — and it guides all the more surely as the ‘exigence’ of coherence is stronger 
and more delicate. 

A. Grothendieck¹ 


Interesting events (e.g. wars) always happen whenever different realisations of the same 
thing confront one another. When clarity and precision are added to the mix, we call this 
mathematics. In particular, the most exciting and significant moments in mathematics 
occur when we discover that seemingly unrelated phenomena are shadows cast by the 
same beast. This book studies one who has been recently awakened. 

In 1978, John McKay made an intriguing observation: 196 884 ≈ 196 883. Monstrous 
Moonshine is the collection of questions (and a few answers) that it directly inspired. No 
one back then could have guessed the riches to which it would lead. But in actual fact, 
Moonshine (albeit non-Monstrous) really began long ago. 


0.1 Modular functions 
Up to topological equivalence (homeomorphism), every compact surface is uniquely 
specified by its genus: a sphere is genus 0, a torus genus 1, etc. However, a (real) surface 
can be made into a complex curve by giving it more structure. For a sphere, up to 
complex-analytic equivalence there is only one way to do this, namely the Riemann 
sphere $\mathbb{C} \cup \{\infty\}$. Surfaces of genus $> 0$ can be given complex structure in a continuum 
of different ways. 

1 Translated in Geometric Galois Actions 1, edited by L. Schneps et al. (Cambridge, Cambridge University 
Press, 1997) page 285. 

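Any such complex curve $\Sigma$ is complex-analytically equivalent to one of the form 
$\Gamma\backslash\overline{\mathbb{H}}$. The upper half-plane 

$$\mathbb{H} := \{\tau \in \mathbb{C} \mid \mathrm{Im}\,\tau > 0\} \qquad (0.1.1)$$

is a model for hyperbolic geometry. Its geometry-preserving maps form the group $\mathrm{SL}_2(\mathbb{R})$ 
of $2 \times 2$ real matrices with determinant 1, which act on $\mathbb{H}$ by the familiar 

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}.\tau = \frac{a\tau + b}{c\tau + d}. \qquad (0.1.2)$$

$\Gamma$ is a discrete subgroup of $\mathrm{SL}_2(\mathbb{R})$. By $\overline{\mathbb{H}}$ here we mean $\mathbb{H}$ with countably many points 
from its boundary $\mathbb{R} \cup \{i\infty\}$ added; these extra boundary points, which depend on $\Gamma$, 
are needed for $\Gamma\backslash\overline{\mathbb{H}}$ to be compact. The construction of the space $\Gamma\backslash\overline{\mathbb{H}}$ of $\Gamma$-orbits in 
$\overline{\mathbb{H}}$ is completely analogous to that of the circle $\mathbb{R}/\mathbb{Z}$ or torus $\mathbb{R}^2/\mathbb{Z}^2$. See Section 2.1.1 
below. 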

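The most important example is $\Gamma = \mathrm{SL}_2(\mathbb{Z})$, because the moduli space of possible 
complex structures on a torus can be naturally identified with $\mathrm{SL}_2(\mathbb{Z})\backslash\mathbb{H}$. For that $\Gamma$, as 
well as all other $\Gamma$ we consider in this book, we have 

$$\overline{\mathbb{H}} = \mathbb{H} \cup \mathbb{Q} \cup \{i\infty\}. \qquad (0.1.3)$$

These additional boundary points $\mathbb{Q} \cup \{i\infty\}$ are called cusps. 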

Both geometry and physics teach us to study a geometric shape through the functions 
(fields) that live on it. The functions $f$ living on $\Sigma = \Gamma\backslash\overline{\mathbb{H}}$ are simply functions 
$f : \overline{\mathbb{H}} \to \mathbb{C}$ that are periodic with respect to $\Gamma$: that is, 

$$f(A.\tau) = f(\tau), \qquad \forall \tau \in \overline{\mathbb{H}},\ A \in \Gamma. \qquad (0.1.4)$$

They should also preserve the complex-analytic structure of $\Sigma$. Ideally this would mean 
that $f$ should be holomorphic, but this is too restrictive, so instead we require meromorphicity 
(i.e. we permit isolated poles). 


Definition 0.1 A modular function $f$ for some $\Gamma$ is a meromorphic function $f : \overline{\mathbb{H}} \to \mathbb{C}$ 
obeying the symmetry (0.1.4). 


It is clear then why modular functions must be important: they are the functions living 
on complex curves. In fact, modular functions and their various generalisations hold a 
central position in both classical and modern number theory. 

We can construct some modular functions for $\Gamma = \mathrm{SL}_2(\mathbb{Z})$ as follows. Define the 
(classical) Eisenstein series by 

$$G_k(\tau) := \sum_{\substack{m,n \in \mathbb{Z} \\ (m,n) \neq (0,0)}} (m\tau + n)^{-k}. \qquad (0.1.5)$$




For odd $k$ it identically vanishes. For even $k > 2$ it converges absolutely, and so defines 
a function holomorphic throughout $\mathbb{H}$. It is easy to see from (0.1.5) that 

$$G_k\!\left(\frac{a\tau + b}{c\tau + d}\right) = (c\tau + d)^k\, G_k(\tau), \qquad \forall \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z}) \qquad (0.1.6)$$

and all $\tau$. This transformation law (0.1.6) means that $G_k$ isn’t quite a modular function 
(it’s called a modular form). However, various homogeneous rational functions of these 
$G_k$ will be modular functions for $\mathrm{SL}_2(\mathbb{Z})$: for example, $G_8(\tau)/G_4(\tau)^2$ (which turns out 
to be constant) and $G_4(\tau)^3/G_6(\tau)^2$ (which doesn’t). All modular functions of $\mathrm{SL}_2(\mathbb{Z})$ 
turn out to arise in this way. 
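
To see why such quotients are genuinely modular, it may help to write out the cancellation of the automorphy factors in one case. The following is only a sketch, using nothing beyond (0.1.6) with $k = 4$ and $k = 6$; the letter $A$ for the matrix is the notation of (0.1.4).

```latex
\frac{G_4(A.\tau)^3}{G_6(A.\tau)^2}
  = \frac{\bigl((c\tau+d)^{4}\,G_4(\tau)\bigr)^{3}}{\bigl((c\tau+d)^{6}\,G_6(\tau)\bigr)^{2}}
  = \frac{(c\tau+d)^{12}\,G_4(\tau)^3}{(c\tau+d)^{12}\,G_6(\tau)^2}
  = \frac{G_4(\tau)^3}{G_6(\tau)^2},
\qquad A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z}).
```

The same bookkeeping works for any quotient whose numerator and denominator have equal total weight, which is what ‘homogeneous’ is doing in the sentence above.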

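Can we characterise all modular functions, for $\Gamma = \mathrm{SL}_2(\mathbb{Z})$ say? We know that any 
modular function is a meromorphic function on the compact surface $\Sigma = \mathrm{SL}_2(\mathbb{Z})\backslash\overline{\mathbb{H}}$. 
As we explain in Section 2.2.4, $\Sigma$ is in fact a sphere. It may seem that we’ve worked 
very hard merely to recover the complex plane $\mathbb{C} = \mathrm{SL}_2(\mathbb{Z})\backslash\mathbb{H}$ and its familiar 
compactification the Riemann sphere $\mathbb{P}^1(\mathbb{C}) = \mathbb{C} \cup \{\infty\} = \mathrm{SL}_2(\mathbb{Z})\backslash\overline{\mathbb{H}}$, but that’s exactly the 
point! 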

Although there are large numbers of meromorphic functions on the complex plane 
$\mathbb{C}$, the only ones that are also meromorphic at $\infty$ (the only functions meromorphic 
on the Riemann sphere $\mathbb{P}^1(\mathbb{C})$) are the rational functions $\frac{\text{polynomial in } z}{\text{polynomial in } z}$ (the others have 
essential singularities there). So if $J$ is a change-of-coordinates (or uniformising) function 
identifying our surface $\Sigma$ with the Riemann sphere, then $J$ (lifted to a function on the 
covering space $\mathbb{H}$) will be a modular function for $\mathrm{SL}_2(\mathbb{Z})$, and any modular function $f(\tau)$ 
will be a rational function in $J(\tau)$: 

$$f(\tau) = \frac{\text{polynomial in } J(\tau)}{\text{polynomial in } J(\tau)}. \qquad (0.1.7)$$
Conversely, any rational function (0.1.7) in $J$ is modular. Thus $J$ generates modular 
functions for $\mathrm{SL}_2(\mathbb{Z})$, in a way analogous to (but stronger and simpler than) how the 
exponential $e(x) = e^{2\pi i x}$ generates the period-1 smooth functions $f$ on $\mathbb{R}$: we can always 
expand such an $f$ in the pointwise-convergent Fourier series $f(x) = \sum_{n=-\infty}^{\infty} a_n\, e(x)^n$. 

There is a standard historical choice $j$ for this uniformisation $J$, namely 

$$j(\tau) := 1728\, \frac{20\, G_4(\tau)^3}{20\, G_4(\tau)^3 - 49\, G_6(\tau)^2} = q^{-1} + 744 + 196\,884\,q + 21\,493\,760\,q^2 + 864\,299\,970\,q^3 + \cdots \qquad (0.1.8)$$

where $q = \exp[2\pi i \tau]$. In fact, this choice (0.1.8) is canonical, apart from the arbitrary 
constant 744. This function $j$ is called the absolute invariant or Hauptmodul for $\mathrm{SL}_2(\mathbb{Z})$, 
or simply the j-function. 
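
The first few coefficients in (0.1.8) can be regenerated with exact integer arithmetic. The sketch below does not use the $G_k$ directly: it assumes the standard rewriting $j = E_4^3/\Delta$, where $E_4 = 1 + 240\sum_{n\ge1}\sigma_3(n)q^n$ is the normalised Eisenstein series and $\Delta = q\prod_{n\ge1}(1-q^n)^{24}$. Neither $E_4$ nor this product formula appears in the text above, and the helper names (`sigma`, `multiply`, `invert`, `j_coefficients`) are ours.

```python
def sigma(n, k):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)


def multiply(a, b, n):
    """Product of two power series (coefficient lists), truncated to n terms."""
    out = [0] * n
    for i, ai in enumerate(a[:n]):
        for k, bk in enumerate(b[: n - i]):
            out[i + k] += ai * bk
    return out


def invert(a, n):
    """Inverse of a power series whose constant term is 1, truncated to n terms."""
    inv = [1] + [0] * (n - 1)
    for m in range(1, n):
        inv[m] = -sum(a[k] * inv[m - k] for k in range(1, m + 1))
    return inv


def j_coefficients(n):
    """First n coefficients of j, starting with the coefficient of q^-1."""
    # Assumption: E4 = 1 + 240 * sum sigma_3(m) q^m (normalised Eisenstein series).
    e4 = [1] + [240 * sigma(m, 3) for m in range(1, n)]
    e4_cubed = multiply(multiply(e4, e4, n), e4, n)
    # Assumption: Delta / q = prod_{m >= 1} (1 - q^m)^24, truncated to n terms.
    delta_over_q = [1] + [0] * (n - 1)
    for m in range(1, n):
        factor = [0] * n
        factor[0], factor[m] = 1, -1
        for _ in range(24):
            delta_over_q = multiply(delta_over_q, factor, n)
    # j = E4^3 / Delta, so index 0 of the quotient is the coefficient of q^-1.
    return multiply(e4_cubed, invert(delta_over_q, n), n)


print(j_coefficients(5))
```

Everything here is integer arithmetic, so the output can be compared digit for digit with (0.1.8); the naive truncated multiplication is only meant for a handful of terms.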


0.2 The McKay equations 


In any case, one of the best-studied functions of classical number theory is the j-function. 
However, its most remarkable property was discovered only recently: McKay’s 
approximations $196\,884 \approx 196\,883$, $21\,493\,760 \approx 21\,296\,876$ and 
$864\,299\,970 \approx 842\,609\,326$. In fact, 

$$196\,884 = 196\,883 + 1, \qquad (0.2.1a)$$
$$21\,493\,760 = 21\,296\,876 + 196\,883 + 1, \qquad (0.2.1b)$$
$$864\,299\,970 = 842\,609\,326 + 21\,296\,876 + 2 \cdot 196\,883 + 2 \cdot 1. \qquad (0.2.1c)$$


The numbers on the left sides of (0.2.1) are the first few coefficients of the j-function. 
The numbers on the right are the dimensions of the smallest irreducible representations 
of Fischer-Griess’s Monster finite simple group M. 
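
The arithmetic in (0.2.1) is easy to check by machine. The few lines below are only a sanity check of the three sums; the numbers are those quoted above, while the table of multiplicities and the names (`MONSTER_DIMS`, `J_COEFFS`, `MULTIPLICITIES`, `right_hand_sides`) are our own bookkeeping.

```python
# Dimensions of the four smallest irreducible representations of the Monster,
# as they appear on the right sides of (0.2.1).
MONSTER_DIMS = [1, 196883, 21296876, 842609326]

# Coefficients of q, q^2 and q^3 in the j-function (0.1.8).
J_COEFFS = [196884, 21493760, 864299970]

# Row i records how many times each dimension occurs in (0.2.1a), (0.2.1b), (0.2.1c).
MULTIPLICITIES = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [2, 2, 1, 1],
]


def right_hand_sides():
    """Evaluate the right sides of the three McKay equations."""
    return [sum(m * d for m, d in zip(row, MONSTER_DIMS)) for row in MULTIPLICITIES]


print(right_hand_sides() == J_COEFFS)
```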

A representation of a group $G$ is the assignment of a matrix $R(g)$ to each element $g$ of 
$G$ in such a way that the matrix product respects the group product, that is $R(g)\,R(h) = R(gh)$. 
The dimension of a representation is the size $n$ of its $n \times n$ matrices $R(g)$. 

The finite simple groups are to finite groups what the primes are to integers — they 
are their elementary building blocks (Section 1.1.2). They have been classified (see 
[22] for recent remarks on the status of this proof). The resulting list consists of 18 
infinite families (e.g. the cyclic groups $\mathbb{Z}_p := \mathbb{Z}/p\mathbb{Z}$ of prime order), together with 26 
exceptional groups. The Monster M is the largest and richest of these exceptionals, with 
order 

$$|M| = 2^{46} \cdot 3^{20} \cdot 5^9 \cdot 7^6 \cdot 11^2 \cdot 13^3 \cdot 17 \cdot 19 \cdot 23 \cdot 29 \cdot 31 \cdot 41 \cdot 47 \cdot 59 \cdot 71 \approx 8 \times 10^{53}. \qquad (0.2.2)$$
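
For readers who like to see the size of (0.2.2) spelled out, here is a short sketch that multiplies out the prime factorisation with Python's exact integers. The exponents are those of (0.2.2); the names `MONSTER_FACTORS` and `monster_order` are ours.

```python
# Prime factorisation of |M| as given in (0.2.2): prime -> exponent.
MONSTER_FACTORS = {
    2: 46, 3: 20, 5: 9, 7: 6, 11: 2, 13: 3,
    17: 1, 19: 1, 23: 1, 29: 1, 31: 1, 41: 1, 47: 1, 59: 1, 71: 1,
}


def monster_order():
    """Multiply out the factorisation (0.2.2) exactly."""
    order = 1
    for prime, exponent in MONSTER_FACTORS.items():
        order *= prime**exponent
    return order


order = monster_order()
print(order)
print(len(str(order)), "decimal digits")
```

A 54-digit number beginning 808... is indeed about $8 \times 10^{53}$.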


Group theorists would like to believe that the classification of finite simple groups is 
one of the high points in the history of mathematics. But isn’t it possible instead that 
their enormous effort has merely culminated in a list of interest only to a handful of 
experts? Years from now, could the Monster — the signature item of this list — become 
a lost bone in a dusty drawer of a forgotten museum, remarkable only for its colossal 
irrelevance? 

With numbers so large, it seemed doubtful to McKay that the numerology (0.2.1) 
was merely coincidental. Nevertheless, it was difficult to imagine any deep conceptual 
relation between the Monster and the j-function: mathematically, they live in different 
worlds. 

In November 1978 he mailed the ‘McKay equation’ (0.2.1a) to John Thompson. At 
first Thompson likened this exercise to reading tea leaves, but after checking the next 
few coefficients he changed his mind. He then added a vital piece to the puzzle. 


0.3 Twisted #0: the Thompson trick 
A nonnegative integer begs interpretation as the dimension of some vector space. Essentially, 
that was what McKay proposed. Let $\rho_0, \rho_1, \ldots$ be the irreducible representations 
of M, ordered by dimension. Then the equations (0.2.1) are really hinting that there is 
an infinite-dimensional graded representation 

$$V = V_{-1} \oplus V_1 \oplus V_2 \oplus V_3 \oplus \cdots \qquad (0.3.1)$$

of M, where $V_{-1} = \rho_0$, $V_1 = \rho_1 \oplus \rho_0$, $V_2 = \rho_2 \oplus \rho_1 \oplus \rho_0$, 
$V_3 = \rho_3 \oplus \rho_2 \oplus \rho_1 \oplus \rho_1 \oplus \rho_0 \oplus \rho_0$, etc., and the j-function is essentially its graded dimension: 

$$j(\tau) - 744 = \dim(V_{-1})\, q^{-1} + \sum_{i=1}^{\infty} \dim(V_i)\, q^i. \qquad (0.3.2)$$
Thompson [525] suggested that we twist this, that is, more generally we consider what 
we now call the McKay-Thompson series 

$$T_g(\tau) = \mathrm{ch}_{V_{-1}}(g)\, q^{-1} + \sum_{i=1}^{\infty} \mathrm{ch}_{V_i}(g)\, q^i \qquad (0.3.3)$$

for each element $g \in M$. The character ‘$\mathrm{ch}_\rho$’ of a representation $\rho$ is given by ‘trace’: 
$\mathrm{ch}_\rho(g) = \mathrm{tr}(\rho(g))$. Up to equivalence (i.e. choice of basis), a representation $\rho$ can be 
recovered from its character $\mathrm{ch}_\rho$. The character, however, is much simpler. For example, 
the smallest nontrivial representation of the Monster M is given by almost $10^{54}$ 
complex matrices, each of size $196\,883 \times 196\,883$, while the corresponding character is 
completely specified by 194 integers (194 being the number of ‘conjugacy classes’ in M). 
For any representation $\rho$, the character value $\mathrm{ch}_\rho(\mathrm{id.})$ equals the dimension of $\rho$, and 
so $T_{\mathrm{id.}}(\tau) = j(\tau) - 744$ and we recover (0.2.1) as special cases. But there are many 
other possible choices of $g \in M$, although conjugate elements $g$, $hgh^{-1}$ have identical 
character values and hence have identical McKay-Thompson series $T_g = T_{hgh^{-1}}$. In fact, 
there are precisely 171 distinct functions $T_g$. Thompson didn’t guess what these functions 
$T_g$ would be, but he suggested that they too might be interesting. 
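
The Monster is far too large to experiment with, but the two facts about characters used above (the value at the identity is the dimension, and conjugate elements have equal character values) can be watched in a toy case. The sketch below uses the permutation representation of the symmetric group on three letters; that group, and all the function names, are our illustrative choices and have nothing to do with M itself.

```python
from itertools import permutations


def perm_matrix(p):
    """Permutation matrix sending basis vector e_j to e_{p[j]}."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def compose(p, q):
    """The permutation 'first q, then p'."""
    return tuple(p[q[j]] for j in range(len(p)))


def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for j, pj in enumerate(p):
        inv[pj] = j
    return tuple(inv)


def character(p):
    """Trace of the permutation matrix, i.e. the number of fixed points of p."""
    m = perm_matrix(p)
    return sum(m[i][i] for i in range(len(p)))


GROUP = list(permutations(range(3)))
IDENTITY = (0, 1, 2)


def is_representation():
    """Check R(g) R(h) = R(gh) for all pairs of group elements."""
    return all(matmul(perm_matrix(g), perm_matrix(h)) == perm_matrix(compose(g, h))
               for g in GROUP for h in GROUP)


def is_class_function():
    """Check ch(h g h^-1) = ch(g) for all g, h."""
    return all(character(compose(compose(h, g), inverse(h))) == character(g)
               for g in GROUP for h in GROUP)


print(is_representation(), is_class_function(), character(IDENTITY))
```

Here six $3 \times 3$ matrices are summarised by three integers, one per conjugacy class; for the Monster the same compression turns almost $10^{54}$ matrices into 194 integers.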


0.4 Monstrous Moonshine 


John Conway and Simon Norton [111] did precisely what Thompson asked. Conway 
called it ‘one of the most exciting moments in my life’ [107] when he opened Jacobi’s 
foundational (but 150-year-old!) book on elliptic and modular functions and found that 
the first few terms of each McKay-Thompson series $T_g$ coincided with the first few 
terms of certain special functions, namely the Hauptmoduls of various genus-0 groups 
$\Gamma$. Monstrous Moonshine, which conjectured that the McKay-Thompson series were 
those Hauptmoduls, was officially born. 

We should explain those terms. When the surface $\Gamma\backslash\overline{\mathbb{H}}$ is a sphere, we call the group $\Gamma$ 
genus 0, and the (appropriately normalised) change-of-coordinates function from $\Gamma\backslash\overline{\mathbb{H}}$ 
to the Riemann sphere $\mathbb{C} \cup \{\infty\}$ the Hauptmodul for $\Gamma$. All modular functions for a 
genus-0 group $\Gamma$ are rational functions of this Hauptmodul. (On the other hand, when $\Gamma$ 
has positive genus, two generators are needed, and there’s no canonical choice for them.) 

The word ‘moonshine’ here is English slang for ‘insubstantial or unreal’, ‘idle talk 
or speculation’,² ‘an illusive shadow’.³ It was chosen by Conway to convey as well the 
impression that things here are dimly lit, and that Conway and Norton were ‘distilling 
information illegally’ from the Monster character table. 

2 Ernest Rutherford (1937): ‘The energy produced by the breaking down of the atom is a very poor kind of 
thing. Anyone who expects a source of power from the transformation of these atoms is talking 
moonshine.’ (quoted in The Wordsworth Book of Humorous Quotations, Wordsworth Editions, 1998). 

3 Dictionary of Archaic Words, J. O. Halliwell, London, Bracken Books, 1987. It also defines moonshine as 
‘a dish composed partly of eggs’, but that probably has less to do with Conway’s choice of word. 

In hindsight, the first incarnation of Monstrous Moonshine goes back to Andrew Ogg 
in 1975. He was in France discussing his result that the primes $p$ for which the group 
$\Gamma_0(p)+$ has genus 0 are 

$$p \in \{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 41, 47, 59, 71\}.$$

(The group $\Gamma_0(p)+$ is defined in (7.1.5).) He also attended there a lecture by Jacques 
Tits, who was describing a newly conjectured simple group. When Tits wrote down the 
order (0.2.2) of that group, Ogg noticed its prime factors coincided with his list of primes. 
Presumably as a joke, he offered a bottle of Jack Daniels whisky to the first person to 
explain the coincidence (he still hasn’t paid up). We now know that each of Ogg’s groups 
$\Gamma_0(p)+$ is the genus-0 modular group for the function $T_g$, for some element $g \in M$ of 
order $p$. Although we now realise why the Monster’s primes must be a subset of Ogg’s, 
probably there is no deep reason why Ogg’s list couldn’t have been longer. 
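
Ogg's coincidence can be replayed in a few lines. The sketch below takes the order (0.2.2) and Ogg's list exactly as quoted above and compares the prime divisors of the one with the entries of the other; trial division is enough because every prime factor is small. The names `OGG_PRIMES`, `MONSTER_ORDER` and `prime_divisors` are ours.

```python
# Ogg's primes p for which Gamma_0(p)+ has genus 0, as listed above.
OGG_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 41, 47, 59, 71]

# The order of the Monster, multiplied out from (0.2.2).
MONSTER_ORDER = (2**46 * 3**20 * 5**9 * 7**6 * 11**2 * 13**3
                 * 17 * 19 * 23 * 29 * 31 * 41 * 47 * 59 * 71)


def prime_divisors(n):
    """Distinct prime divisors of n in increasing order, by trial division."""
    primes = []
    p = 2
    while n > 1:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    return primes


print(prime_divisors(MONSTER_ORDER) == OGG_PRIMES)
```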

The appeal of Monstrous Moonshine lies in its mysteriousness: it unexpectedly associates 
various special modular functions with the Monster, even though modular functions 
and elements of M are conceptually incommensurable. Now, ‘understanding’ something 
means to embed it naturally into a broader context. Why is the sky blue? Because of the 
way light scatters in gases. Why does light scatter in gases the way it does? Because 
of Maxwell’s equations. In order to understand Monstrous Moonshine, to resolve the 
mystery, we should search for similar phenomena, and fit them all into the same story. 


0.5 The Moonshine of $E_8$ and the Leech 


McKay had also remarked in 1978 that similar numerology to (0.2.1) holds if M and 
$j(\tau)$ are replaced with the Lie group $E_8(\mathbb{C})$ and 

$$j(\tau)^{1/3} = q^{-1/3}\,(1 + 248\,q + 4124\,q^2 + 34\,752\,q^3 + \cdots). \qquad (0.5.1)$$

In particular, $4124 = 3875 + 248 + 1$ and $34\,752 = 30\,380 + 3875 + 2 \cdot 248 + 1$, 
where 248, 3875 and 30 380 are all dimensions of irreducible representations of $E_8(\mathbb{C})$. 
A Lie group is a manifold with compatible group structure; the groups of $E_8$ type play 
the same role in Lie theory that the Monster does for finite groups. Incidentally, $j^{1/3}$ is 
the Hauptmodul of the genus-0 group $\Gamma(3)$ (see (2.2.4a)). 
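
The expansion (0.5.1) and the two decompositions can also be checked exactly. The sketch below assumes the standard identity $j^{1/3} = E_4/\eta^8$, that is $q^{1/3}\,j^{1/3} = E_4 \prod_{n\ge1}(1-q^n)^{-8}$ with $E_4 = 1 + 240\sum\sigma_3(n)q^n$; this identity is not stated in the text, and the helper names are ours. The dimensions 1, 248, 3875, 30 380 are the ones quoted above.

```python
def sigma(n, k):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)


def multiply(a, b, n):
    """Product of two power series (coefficient lists), truncated to n terms."""
    out = [0] * n
    for i, ai in enumerate(a[:n]):
        for k, bk in enumerate(b[: n - i]):
            out[i + k] += ai * bk
    return out


def invert(a, n):
    """Inverse of a power series whose constant term is 1, truncated to n terms."""
    inv = [1] + [0] * (n - 1)
    for m in range(1, n):
        inv[m] = -sum(a[k] * inv[m - k] for k in range(1, m + 1))
    return inv


def cube_root_j_coefficients(n):
    """Coefficients of 1, q, q^2, ... in q^(1/3) * j^(1/3)."""
    # Assumption: q^(1/3) j^(1/3) = E4 / prod_{m >= 1} (1 - q^m)^8.
    e4 = [1] + [240 * sigma(m, 3) for m in range(1, n)]
    product = [1] + [0] * (n - 1)
    for m in range(1, n):
        factor = [0] * n
        factor[0], factor[m] = 1, -1
        for _ in range(8):
            product = multiply(product, factor, n)
    return multiply(e4, invert(product, n), n)


# Dimensions of E8 representations quoted in the text, and the multiplicities
# with which they appear in the decompositions of 4124 and 34752.
E8_DIMS = [1, 248, 3875, 30380]
E8_MULTIPLICITIES = [[1, 1, 1, 0], [1, 2, 1, 1]]


def e8_sums():
    """Evaluate the two decompositions stated after (0.5.1)."""
    return [sum(m * d for m, d in zip(row, E8_DIMS)) for row in E8_MULTIPLICITIES]


print(cube_root_j_coefficients(4), e8_sums())
```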

A more elementary observation concerns the Leech lattice. A lattice is a discrete 
periodic set $L$ in $\mathbb{R}^n$, and the Leech lattice $\Lambda$ is a particularly special one in 24 dimensions. 
196 560, the number of vectors in the Leech lattice with length-squared 4, is also close 
to 196 884: in fact, 

$$196\,884 = 196\,560 + 324 \cdot 1, \qquad (0.5.2a)$$
$$21\,493\,760 = 16\,773\,120 + 24 \cdot 196\,560 + 3200 \cdot 1, \qquad (0.5.2b)$$
$$864\,299\,970 = 398\,034\,000 + 24 \cdot 16\,773\,120 + 324 \cdot 196\,560 + 25\,650 \cdot 1, \qquad (0.5.2c)$$

where 16 773 120 and 398 034 000 are the numbers of length-squared 6- and 8-vectors 
in the Leech. This may not seem as convincing as (0.2.1), but the same equations hold 
for any of the 24-dimensional even self-dual lattices, apart from an extra term on the 
right sides corresponding to length-squared 2 vectors (there are none of these in the 
Leech). 

What conceptually do the Monster, $E_8$ and the Leech lattice have to do with the 
j-function? Is there a common theory explaining this numerology? The answer is yes! 

It isn’t difficult to relate $E_8$ to the j-function. In the late 1960s, Victor Kac [325] and 
Robert Moody [430] independently (and for entirely different reasons) defined a new 
class of infinite-dimensional Lie algebras. A Lie algebra is a vector space with a bilinear 
vector-valued product that is both anti-commutative and anti-associative (Section 1.4.1). 
The familiar vector-product $u \times v$ in three dimensions defines a Lie algebra, called 
$\mathfrak{sl}_2$, and in fact this algebra generates all Kac-Moody algebras. Within a decade it was 
realised that the graded dimensions of representations of the affine Kac-Moody algebras 
are (vector-valued) modular functions for $\mathrm{SL}_2(\mathbb{Z})$ (Theorem 3.2.3). 

Shortly after McKay’s $E_8$ observation, Kac [326] and James Lepowsky [373] independently 
remarked that the unique level-1 highest-weight representation $L(\omega_0)$ of the 
affine Kac-Moody algebra $E_8^{(1)}$ has graded dimension $j(q)^{1/3}$. Since each homogeneous 
piece of any representation $L(\lambda)$ of the affine Kac-Moody algebra $X_\ell^{(1)}$ must carry 
a representation of the associated finite-dimensional Lie group $X_\ell(\mathbb{C})$, and the graded 
dimensions (multiplied by an appropriate power of $q$) of an affine algebra are modular 
functions for some $\Gamma \subseteq \mathrm{SL}_2(\mathbb{Z})$, this explained McKay’s $E_8$ observation. His Monster 
observations took longer to clarify because so much of the mathematics needed was still 
to be developed. 

Euler played with a function $t(x) := 1 + 2x + 2x^4 + 2x^9 + 2x^{16} + \cdots$, because it 
counts the ways a given number can be written as a sum of squares of integers. In his study 
of elliptic integrals, Jacobi (and Gauss before him) noticed that if we change variables by 
$x = e^{\pi i \tau}$ then the resulting function $\theta_3(\tau) := 1 + 2e^{\pi i \tau} + 2e^{4\pi i \tau} + \cdots$ behaves nicely 
with respect to certain transformations of $\tau$; we say today that Jacobi’s theta function 
$\theta_3$ is a modular form of weight $\frac{1}{2}$ for a certain index-3 subgroup of $\mathrm{SL}_2(\mathbb{Z})$. More 
generally, something similar holds when we replace $\mathbb{Z}$ with any other lattice $L$: the theta 
series 

$$\Theta_L(\tau) := \sum_{n \in L} e^{\pi i \tau\, n \cdot n}$$

is also a modular form, provided all length-squares $n \cdot n$ are rational. In particular, we 
obtain quite quickly that the theta series of the Leech lattice, divided by Ramanujan’s 
modular form $\Delta(\tau)$, will equal $J(\tau) + 24$ (here $J$ denotes $j - 744$). 
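
That last claim can be tested on the first few coefficients, and doing so also shows where the multipliers 24, 324, 3200 and 25 650 in (0.5.2) come from. The sketch below takes the Leech vector counts from the text, assumes the product formula $\Delta = q\prod_{n\ge1}(1-q^n)^{24}$ with $q = e^{2\pi i\tau}$ (so that vectors of length-squared $2m$ contribute to $q^m$), and divides. The names `LEECH_THETA`, `inverse_delta_over_q` and `theta_over_delta` are ours.

```python
# Theta series of the Leech lattice in powers of q = exp(2 pi i tau):
# the coefficient of q^m counts vectors of length-squared 2m (counts from the text).
LEECH_THETA = [1, 0, 196560, 16773120, 398034000]


def multiply(a, b, n):
    """Product of two power series (coefficient lists), truncated to n terms."""
    out = [0] * n
    for i, ai in enumerate(a[:n]):
        for k, bk in enumerate(b[: n - i]):
            out[i + k] += ai * bk
    return out


def invert(a, n):
    """Inverse of a power series whose constant term is 1, truncated to n terms."""
    inv = [1] + [0] * (n - 1)
    for m in range(1, n):
        inv[m] = -sum(a[k] * inv[m - k] for k in range(1, m + 1))
    return inv


def inverse_delta_over_q(n):
    """Coefficients of q / Delta = prod_{m >= 1} (1 - q^m)^(-24), truncated to n terms."""
    product = [1] + [0] * (n - 1)
    for m in range(1, n):
        factor = [0] * n
        factor[0], factor[m] = 1, -1
        for _ in range(24):
            product = multiply(product, factor, n)
    return invert(product, n)


def theta_over_delta(theta):
    """Coefficients of theta / Delta, starting with the coefficient of q^-1."""
    n = len(theta)
    return multiply(theta, inverse_delta_over_q(n), n)


print(inverse_delta_over_q(5))
print(theta_over_delta(LEECH_THETA))
```

The second list is the expansion of $j$ in (0.1.8) with the constant 744 replaced by 24, and reading off how each entry is assembled from the two factors reproduces (0.5.2a) to (0.5.2c) term by term.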

For both $E_8$ and the Leech, the j-function arises from a uniqueness property ($L(\omega_0)$ is 
the only ‘level-1’ $E_8^{(1)}$-module; the Leech lattice $\Lambda$ is self-dual), together with the empirical 
observation that $\mathrm{SL}_2(\mathbb{Z})$ has few modular forms of small level. In these examples, 
the appearance of the j-function isn’t as significant as that of modularity. 
[Figure: Monster, lattices, affine algebras, ... on one side; Hauptmoduls, theta functions, ... on the other] 

Fig. 0.1 Moonshine in its broader sense. 


0.6 Moonshine beyond the Monster 


We’ve known for many years that lattices (quadratic forms) and Kac-Moody algebras are 
related to modular forms and functions. But these observations, albeit now familiar, are 
also a little mysterious, we should confess. For instance, compare the non-obvious fact 
that $\theta_3(-1/\tau) = \sqrt{\tau/i}\;\theta_3(\tau)$ with the trivial observation (0.1.6) that $G_k(-1/\tau) = \tau^k G_k(\tau)$ 
for the Eisenstein series $G_k$. The modularity of $G_k$ is a special case of the elementary 
observation that $\mathrm{SL}_n(\mathbb{Z})$ parametrises the change-of-bases of $n$-dimensional lattices. The 
modularity of $\theta_3$, on the other hand, begs a conceptual explanation (indeed, see the quote 
by Weil at the beginning of Section 2.4.2), even though its logical explanation (i.e. proof) 
is a quick calculation from, for example, the Poisson summation formula (Section 2.2.3). 
Moonshine really began with Jacobi and Gauss. 
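
The theta transformation may be non-obvious, but it is easy to watch numerically. The sketch below sums the series for $\theta_3$ to a few terms and measures the defect in $\theta_3(-1/\tau) = \sqrt{\tau/i}\,\theta_3(\tau)$ at sample points; the truncation length, the sample points and the function names are our choices, and the principal branch of the square root is assumed (it is the right one because $\tau/i$ has positive real part on the upper half-plane).

```python
import cmath
import math


def theta3(tau, terms=12):
    """Truncated series 1 + 2 * sum_{n >= 1} exp(pi i tau n^2), for Im(tau) > 0."""
    return 1 + 2 * sum(cmath.exp(1j * math.pi * tau * n * n)
                       for n in range(1, terms + 1))


def theta3_defect(tau):
    """Absolute difference between the two sides of the transformation law."""
    lhs = theta3(-1 / tau)
    rhs = cmath.sqrt(tau / 1j) * theta3(tau)
    return abs(lhs - rhs)


for tau in (2j, 0.3 + 0.8j):
    print(tau, theta3_defect(tau))
```

Twelve terms are plenty as long as the imaginary parts of $\tau$ and $-1/\tau$ are not too small, since the $n$-th term decays like $e^{-\pi\,\mathrm{Im}(\tau)\,n^2}$; a check of this kind is of course no substitute for the Poisson summation argument.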


Moonshine should be regarded as a certain collection of related examples where 
algebraic structures have been associated with automorphic functions or forms. 


Grappling with that thought is the theme of our book. Chapters 1 to 6 could be (rather 
narrowly) regarded as supplying a context for Monstrous Moonshine, on which we focus 
in Chapter 7. From this larger perspective, illustrated in Figure 0.1, what is special about 
this single instance called Monstrous Moonshine is that the several associated modular 
functions are all of a special class (namely Hauptmoduls). 

The first major step in the proof of Monstrous Moonshine was accomplished in the mid-1980s 
with the construction by Frenkel-Lepowsky-Meurman [200] of the Moonshine 
module $V^\natural$ and its interpretation by Richard Borcherds [68] as a vertex operator algebra. 
A vertex operator algebra (VOA) is an infinite-dimensional vector space with infinitely 
many heavily constrained vector-valued bilinear products (Chapter 5). It is a natural, 
though extremely intricate, extension of the notion of a Lie algebra. Any algebra $A$ can 
be interpreted as an assignment of a linear map $A \otimes \cdots \otimes A \to A$ to each binary tree; 
from this perspective a VOA $\mathcal{V}$ associates a linear map $\mathcal{V} \otimes \cdots \otimes \mathcal{V} \to \mathcal{V}$ with each 
‘inflated’ binary tree, that is each sphere with discs removed. 

In 1992 Borcherds [72] completed the proof of the original Monstrous Moonshine 
conjectures⁴ by showing that the graded characters $T_g$ of $V^\natural$ are indeed the Hauptmoduls 
identified by Conway and Norton, and hence that $V^\natural$ is indeed the desired representation 
$V$ of M conjectured by McKay and Thompson. The explanation of Moonshine suggested 
by this picture is given in Figure 0.2. The algebraic structure typically arises as the 
symmetry group of the associated VOA; for example, that of $V^\natural$ is the Monster M. 
By Zhu’s Theorem (Theorem 5.3.8), the modular forms/functions appear as graded 
dimensions of the (possibly twisted) modules of the VOA. In particular, the answer this 
framework provides for what M, $E_8$ and the Leech have to do with $j$ is that they each 
correspond to a VOA with a single simple module; their relation to $j$ is then an immediate 
corollary to the much more general Zhu’s Theorem. 

4 As we see in Chapter 7, most Moonshine conjectures involving the Monster are still open. 

Fig. 0.2 The algebraic meaning of Moonshine. 

It must be emphasised that Figure 0.2 is primarily meant to address Moonshine in 
the broader sense of Figure 0.1, so certain special features of, for example, Monstrous 
Moonshine (in particular that all the $T_g$ are Hauptmoduls) are more subtle and have 
to be treated by special arguments. These are quite fascinating by themselves, and are 
discussed in Chapter 7. Even so, Figure 0.2 provides a major clue: 


If you’re trying to understand a seemingly mysterious occurrence of the Monster, 
try replacing the word ‘Monster’ with its synonym ‘the automorphism group of the 
vertex operator algebra $V^\natural$’. 


This places the Monster into a much richer algebraic context, with numerous connections 
with other areas of mathematics. 


0.7 Physics and Moonshine 


Moonshine is profoundly connected with physics (namely conformal field theory and 
string theory). String theory proposes that the elementary particles (electrons, photons, 
quarks, etc.) are vibrational modes on a string of length about $10^{-33}$ cm. These strings 
can interact only by splitting apart or joining together; as they evolve through time, 
these (classical) strings will trace out a surface called the world-sheet. Quantum field 
theory tells us that the quantum quantities of interest (amplitudes) can be perturbatively 
computed as weighted averages taken over spaces of these world-sheets. Conformally 
equivalent world-sheets should be identified, so we are led to interpret amplitudes as 
certain integrals over moduli spaces of surfaces. This approach to string theory leads 
to a conformally invariant quantum field theory on two-dimensional space-time, called 
conformal field theory (CFT). The various modular forms and functions arising in Moonshine 
appear as integrands in some of these genus-1 (‘1-loop’) amplitudes: hence their 
modularity is manifest. 
Fig. 0.3 The stringy picture of Moonshine. 


Many aspects of Moonshine make complete sense within CFT, something which helps 
make the words of Freeman Dyson ring prophetic: 


I have a sneaking hope, a hope unsupported by any facts or any evidence, that 
sometime in the twenty-first century physicists will stumble upon the Monster 
group, built in some unsuspected way into the structure of the universe [167]. 


All that said, here we are, sometime in the twenty-first century, and alas the Monster 
still plays at best a peripheral role in physics. And some aspects of Moonshine (e.g. the 
Hauptmodul property) remain obscure in CFT. In any case, although this is primarily 
a mathematics book, we often sit in chairs warmed by physicists. In particular, CFT 
(or what is essentially the same thing, perturbative string theory⁵) is, at least in part, a 
machine for producing modular functions. Here, Figure 0.2 becomes Figure 0.3. More 
precisely, the algebraic structure is an underlying symmetry of the CFT, and its graded 
dimensions are the various modular functions. VOAs can be regarded as an algebraic 
abstraction of CFT, since they arise quite naturally by applying the Wightman axioms 
of quantum field theory to CFT. The lattice theta functions come from bosonic strings 
living on the torus $\mathbb{R}^n/L$. The affine Kac-Moody characters arise when the strings live 
on a Lie group. And the Monster is the symmetry of a string theory for a $\mathbb{Z}_2$-orbifold of 
free bosons compactified on the Leech lattice torus $\mathbb{R}^{24}/\Lambda$. 

Physics reduces Moonshine to a duality between two different pictures of quantum field 
theory: the Hamiltonian one, which concretely gives us from representation theory the 
graded vector spaces, and another, due to Feynman, which manifestly gives us modularity. 
In particular, physics tells us that this modularity is a topological effect, and the group 
$\mathrm{SL}_2(\mathbb{Z})$ directly arises in its familiar role as the modular group of the torus. 

Historically speaking, Figure 0.3 preceded and profoundly affected Figure 0.2. One 
reason the stringy picture is exciting is that the CFT machine in Figure 0.3 outputs much 
more than modular functions — it creates automorphic functions and forms for the various 
mapping class groups of surfaces with punctures. And all this is still poorly explored. 
We can thus expect more from Moonshine than Figure 0.2 alone suggests. On the other 
hand, once again Figure 0.3 can directly explain only the broader aspects of Moonshine. 


5 Curiously, although nonperturbative string theory should be physically more profound, it is the perturbative 
calculations that are most relevant to the mathematics of Moonshine. 
0.8 Braided #0: the meaning of Moonshine 


In spite of the work of Borcherds and others, the special features of Monstrous Moonshine 
still beg questions. The full conceptual relationship between the Monster and 
Hauptmoduls (like $j$) arguably remains ‘dimly lit’, although much progress has been 
realised. This is a subject where it is much easier to speculate than to prove, and we are 
still awash in unresolved conjectures. But most important, we need a second independent 
proof of Monstrous Moonshine. In order to clarify the still murky significance of 
the Monster in Moonshine, we need to understand to what extent Monstrous Moonshine 
determines the Monster. More generally, we need to go beneath the algebraic explanation 
of Moonshine in order to find its more fundamental meaning, which is probably topological. 
Explaining something (Moonshine in this case) with something more complicated 
(CFT or VOAs here) cannot be the end of the story. Surely it is instead a beginning. 

To Poincaré 125 years ago, modularity arose through the monodromy of differential 
equations. Remarkably, today CFT provides a similar explanation, although the relevant 
partial differential equations are much more complicated. The monodromy group 
here is the braid group $\mathcal{B}_3$, and the modular group $\mathrm{SL}_2(\mathbb{Z})$ arises as a homomorphic 
image. 

Today we are taught to lift modular forms for $\mathrm{SL}_2(\mathbb{Z})$ to the space $L^2(\mathrm{SL}_2(\mathbb{Z})\backslash\mathrm{SL}_2(\mathbb{R}))$, 
which carries a representation of the Lie group $\mathrm{SL}_2(\mathbb{R})$. However, $\mathrm{SL}_2(\mathbb{R})$ is not simply 
connected; its universal cover $\widetilde{\mathrm{SL}_2(\mathbb{R})}$ is a central extension by $\mathbb{Z}$, and the corresponding 
central extension of $\mathrm{SL}_2(\mathbb{Z})$ (the fundamental group of $\mathrm{SL}_2(\mathbb{Z})\backslash\mathrm{SL}_2(\mathbb{R})$) is the braid 
group $\mathcal{B}_3$. By all rights, these central extensions should be more fundamental. Indeed, 
modular forms of fractional weight, such as the Dedekind eta, certainly see $\mathcal{B}_3$ more 
directly than they do $\mathrm{SL}_2(\mathbb{Z})$ (Section 2.4.3). Similar comments hold for other $\Gamma$; for 
example, the congruence subgroup $\Gamma(2)$ lifts to the pure braid group $\mathcal{P}_3$. 

The best approach we know for relating the Monster and the Hauptmodul property is 
Norton’s action of $\mathcal{B}_3$ on $G \times G$. This associates a genus-0 property with ‘6-transposition 
groups’, which in turn points to a special role for M, as the Monster is expected to be 
essentially the largest such group (Section 7.3.3). Incidentally, the number ‘6’ arises here 
because the principal congruence subgroup $\Gamma(N)$ is genus 0 iff $N < 6$. 
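
The threshold $N < 6$ can be seen from the classical genus formula for $\Gamma(N)$. That formula is not given in this introduction, so the sketch below states it as an assumption: for $N \ge 3$ the image of $\Gamma(N)$ in $\mathrm{PSL}_2(\mathbb{Z})$ has index $\mu = \frac{N^3}{2}\prod_{p \mid N}(1 - p^{-2})$ and the genus is $1 + \mu\,(N-6)/(12N)$, while $N = 1, 2$ give spheres. The function names are ours.

```python
from fractions import Fraction


def prime_divisors(n):
    """Distinct prime divisors of n in increasing order, by trial division."""
    primes, p = [], 2
    while n > 1:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    return primes


def genus_of_gamma(N):
    """Genus of the compactified surface for the principal congruence subgroup Gamma(N)."""
    if N <= 2:
        return 0  # Gamma(1) = SL2(Z) and Gamma(2) both give spheres
    # Assumed classical formula: index of the image of Gamma(N) in PSL2(Z), N >= 3.
    index = Fraction(N**3, 2)
    for p in prime_divisors(N):
        index *= 1 - Fraction(1, p * p)
    genus = 1 + index * (N - 6) / (12 * N)
    assert genus.denominator == 1  # the formula always yields an integer
    return int(genus)


print([(N, genus_of_gamma(N)) for N in range(1, 13)])
```

The factor $(N - 6)$ makes the sign change visible at a glance: the correction term is negative for $N < 6$, vanishes at $N = 6$ (genus 1), and is positive afterwards.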

For these reasons and others we explore on the following pages, we expect a new 
proof for Moonshine to involve the braid group $\mathcal{B}_3$. The modular groups $\mathrm{SL}_2(\mathbb{Z})$ and 
$\mathrm{PSL}_2(\mathbb{Z})$ arise only indirectly as quotients. We also identify other promising places 
to look for alternate arguments for Moonshine; for example, the partial differential 
equations of CFT are built from the heat kernel, which has a long historical association 
with modularity. 


0.9 The book 


Borcherds’ paper [72] and the resulting Fields medal close the opening chapter of the 
story of Moonshine. Now, 25 years after its formulation in [111], we are in a period of 
consolidation and synthesis, flames fanned I hope by this book. 
Most of us might liken much of our research to climbing a steep hill against a stiff 
breeze: every so often we stumble and roll to the bottom, but with persistence we eventually 
reach the summit and plant our flag amongst the others already there. And before our 
bruises fade and bones mend, we’re off to the next hill. But perhaps research in its purest 
form is more like chasing squirrels. As soon as you spot one and leap towards it, it darts 
away, zigging and zagging, always just out of reach. If you’re a little lucky, you might 
stick with it long enough to see it climb a tree. You'll never catch the damned squirrel, 
but chasing it will lead you to a tree. In mathematics, the trees are called theorems. The 
squirrels are those nagging little mysteries we write at the top of many sheets of paper. 
We never know where our question will take us, but if we stick with it, it’ll lead us to a 
theorem. That I think is what research ideally is like. There is no higher example of this 
than Moonshine. 

This book addresses the theory of the blob of Figure 0.1. We explore some of its 
versatility in Chapter 6, where we glimpse Moonshine orthogonal to the Monster. Like 
moonlight itself, Monstrous Moonshine is an indirect phenomenon. Just as in the theory 
of moonlight one must introduce the sun, so in the theory of Moonshine one must go well 
beyond the Monster. Much as a book discussing moonlight may include paragraphs on 
sunsets or comet tails, so do we discuss fusion rings, Galois actions and knot invariants. 
The following chapters use Moonshine (Monstrous and otherwise) as a happy excuse to 
take a rather winding little tour through modern mathematics and physics. If we offer 
more questions and suggestions than theorems and answers, at least that is in Moonshine’s 
spirit. 

This is not a textbook. The thought bobbing above my head like a balloon while 
writing was that the brain is driven by the qualitative — at the deepest level those are the 
only truths we seek and can absorb. I’m trying to share with the reader my understanding 
(such as it is) of several remarkable topics that fit loosely together under the motley 
banner Moonshine. I hope it fills a gap in the literature, by focusing more on the ideas 
and less on the technical minutiae, important though they are. But even if not, it was a 
pleasure to write, and I think that comes across on every page. 

This book is philosophic and speculative, because Moonshine is. It is written for both 
physicists and mathematicians, because both subjects have contributed to the theory. 
Partly for this reason, this book differs from other mathematics books in the lack of 
formal arguments, and differs from other physics books in the lack of long formulae. 
Without doubt this will froth many mouths. Because the potential readership for this 
story is unusually diverse, I have tried to assume minimal formal background. Hence 
when you come to shockingly trivial passages or abrasively uninteresting tangents, please 
realise they weren’t written for you. 

In modern mathematics there is a strong tendency towards formulations of concepts 
that minimise the number and significance of arbitrary choices. This crispness tends to 
emphasise the naturality of the construction or definition, at the expense sometimes of 
accessibility. Our mathematics is more conceptual today — more beautiful perhaps — but 
the cost of less explicitness is the compartmentalism that curses our discipline. We have 
cut ourselves off not only from each other, but also from our past. In this book I’ve tried
to balance this asceticism with accessibility. Some things have surely been lost, but some
perhaps have been gained. 

The book endures some glaring and painful omissions, due mostly to fear of spousal 
reprisals were I to miss yet another deadline. I hope for a second edition. In it I would 
include a gentle introduction to geometric Langlands. I’d correct the total disregard here 
for all things supersymmetric — after all, most of the geometric impact of string theory 
involves supersymmetry. The mathematical treatment of CFT in Chapter 4 is sparser than
I’d like. Section 5.4 was originally planned to include brief reviews of the chiral algebras
of Beilinson-Drinfel’d [48] and the coordinate-free approach to VOAs developed in
[197]. Cohomological issues arise in every chapter, where they are nonetheless quietly 
ignored. The lip-service paid to subfactors does no justice to their beautiful role in the 
theory. 

I will probably be embarrassed five years from now as to what today I feel is important. 
But at worst I'll be surprised five years from now at what today I find interesting. The 
topics were selected based on my present interests. Other authors (and even me five years 
from now) would make different choices, but for that I won’t apologise. 

So let the chase begin. . . 


1 


Classical algebra 


In this chapter we sketch the basic material — primarily algebra — needed in later chapters. 
As mentioned in the Introduction, the aspiration of this book isn’t to ‘Textbookhood’.
There are plenty of good textbooks on the material of this chapter (e.g. [162]). What is 
harder to find are books that describe the ideas beneath and the context behind the various 
definitions, theorems and proofs. This book, and this chapter, aspire to that. What we 
lose in depth and detail, we hope to gain in breadth and conceptual content. The range 
of readers in mind is diverse, from mathematicians expert in other areas to physicists, 
and the chosen topics, examples and explanations try to reflect this range. 

Finite groups (Section 1.1) and lattices (Section 1.2.1) appear as elementary examples 
throughout the book. Lie algebras (Section 1.4), more than their nonlinear partners the 
Lie groups, are fundamental to us, especially through their representations (Section 1.5). 
Functional analysis (Section 1.3), category theory (Section 1.6) and algebraic number 
theory (Section 1.7) play only secondary roles. Section 1.2 provides some background 
geometry, but for proper treatments consult [113], [104], [527], [59], [478]. 

Note the remarkable unity of algebra. Algebraists look at mathematics and science and 
see structure; they study form rather than content. The foundations of a new theory are 
laid by running through a fixed list of questions; only later, as the personality quirks of 
the new structure become clearer, does the theory become more individual. For instance, 
among the first questions asked are: What does ‘finite’ mean here? and What plays the 
role of a prime number? Mathematics (like any subject) evolves by asking questions, and 
though a good original question thunders like lightning at night, it is as rare as genius 
itself. See the beautiful book [504] for more of algebra presented in this style. 


1.1 Discrete groups and their representations 


The notion of a group originated essentially in the nineteenth century with Galois, who 
also introduced normal subgroups and their quotients G/N, all in the context of what we 
now call Galois theory (Section 1.7.2). According to Poincaré, when all of mathematics 
is stripped of its contents and reduced to pure form, the result is group theory.¹ Groups
are the devices that act, which explains their fundamental role in mathematics. In physics,
as in much of Moonshine, groups arise through their representations. Standard references
for representation theory are [308], [219]; gentle introductions to various aspects of 
group theory are [162], [421] (the latter is especially appropriate for physicists). 


1 See page 499 of J.-P. Serre, Notices Amer. Math. Soc. (May 2004).


1.1.1 Basic definitions


A group is a set $G$ with an associative product $gg'$ and an identity $e$, such that each
element $g \in G$ has an inverse $g^{-1}$. The number of elements $\|G\|$ of a group is called its
order, and is commonly denoted $|G|$.

If we’re interested in groups, then we’re interested in comparing groups, that is we’re
interested in functions $\varphi : G \to H$ that respect group structure. What this means is $\varphi$
takes products in $G$ to products in $H$, the identity $e_G$ in $G$ to the identity $e_H$ in $H$, and
the inverse in $G$ to the inverse in $H$ (the last two conditions are redundant). Such $\varphi$ are
called group homomorphisms.

Two groups $G$, $H$ are considered equivalent or isomorphic, written $G \cong H$, if as far
as the essential group properties are concerned (think ‘form’ and not ‘content’), the two
groups are indistinguishable. That is, there is a group homomorphism $\varphi : G \to H$ that
is a bijection (so $\varphi^{-1}$ exists) and $\varphi^{-1} : H \to G$ is itself a group homomorphism (this
last condition is redundant). An automorphism (or symmetry) of $G$ is an isomorphism
$G \to G$; the set $\mathrm{Aut}\,G$ of all automorphisms of $G$ forms a group.

For example, consider the cyclic group $\mathbb{Z}_n = \{[0], [1], \ldots, [n-1]\}$ consisting of the
integers taken mod $n$, with group operation addition. Write $U_1(\mathbb{C})$ for the group of
complex numbers with modulus 1, with group operation multiplication. Then
$\varphi([a]) = e^{2\pi i a/n}$ defines a homomorphism between $\mathbb{Z}_n$ and $U_1(\mathbb{C})$. The group of positive real
numbers under multiplication is isomorphic to the group of real numbers under addition,
the isomorphism being given by logarithm; as far as their group structure is concerned,
they are identical. $\mathrm{Aut}\,\mathbb{Z} \cong \mathbb{Z}_2$, corresponding to multiplying the integers by $\pm 1$, while
$\mathrm{Aut}\,\mathbb{Z}_n$ is the multiplicative group $\mathbb{Z}_n^\times$, consisting of all numbers $1 \le \ell \le n$ coprime to
$n$ (i.e. $\gcd(\ell, n) = 1$), with the operation being multiplication mod $n$.
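
The two examples above are easy to check by brute force. The following is only a sketch: the function names `phi`, `units` and `automorphisms` are our own, and the homomorphism test compares floating-point numbers up to an assumed tolerance.

```python
import cmath
from math import gcd


def phi(a, n):
    """Send the class [a] in Z_n to the unit complex number e^{2 pi i a / n}."""
    return cmath.exp(2j * cmath.pi * a / n)


def is_homomorphism(n, tol=1e-9):
    """Check phi([a] + [b]) = phi([a]) * phi([b]) for all classes in Z_n."""
    return all(
        abs(phi((a + b) % n, n) - phi(a, n) * phi(b, n)) < tol
        for a in range(n)
        for b in range(n)
    )


def units(n):
    """Residues l with 1 <= l <= n and gcd(l, n) = 1, i.e. the elements of Z_n^x."""
    return [l for l in range(1, n + 1) if gcd(l, n) == 1]


def automorphisms(n):
    """Each unit l gives the automorphism [a] -> [l a], stored as its tuple of images."""
    return [tuple((l * a) % n for a in range(n)) for l in units(n)]
```

For instance, `units(12)` lists the four residues 1, 5, 7, 11, so $\mathrm{Aut}\,\mathbb{Z}_{12}$ has four elements.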

A field is an algebraic abstraction of the concept of number: in one we can add, subtract,
multiply and divide, and all the usual properties like commutativity and distributivity
are obeyed. Fields were also invented by Galois. $\mathbb{C}$, $\mathbb{R}$ and $\mathbb{Q}$ are fields, while $\mathbb{Z}$ is not
(you can’t always divide an integer by, for example, 3 and remain in $\mathbb{Z}$). The integers
mod $n$, i.e. $\mathbb{Z}_n$, are a field iff $n$ is prime (e.g. in $\mathbb{Z}_4$, it is not possible to divide by the
element [2] even though $[2] \ne [0]$ there). $\mathbb{C}$ and $\mathbb{R}$ are examples of fields of characteristic
0: this means that 0 is the only integer $k$ with the property that $kx = 0$ for all $x$ in the
field. We say $\mathbb{Z}_p$ has characteristic $p$ since multiplying by the integer $p$ has the same
effect as multiplying by 0. There is a finite field with $q$ elements iff $q$ is a power of a
prime, in which case the field is unique and is called $\mathbb{F}_q$. Strange fields have important
applications in, for example, coding theory and, ironically, in number theory itself (see
Sections 1.7.1 and 2.4.1).
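
The claim that $\mathbb{Z}_n$ is a field iff $n$ is prime can be tested directly for small $n$. This is a minimal sketch with hypothetical helper names; it searches for multiplicative inverses by exhaustion rather than using anything clever.

```python
def invertible_elements(n):
    """Nonzero classes [a] in Z_n that possess a multiplicative inverse."""
    return [a for a in range(1, n) if any((a * b) % n == 1 for b in range(1, n))]


def is_field(n):
    """Z_n is a field exactly when every nonzero class is invertible."""
    return n >= 2 and len(invertible_elements(n)) == n - 1


def is_prime(n):
    """Trial division, adequate for the small n used here."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))
```

In $\mathbb{Z}_4$ only [1] and [3] are invertible, which is the failure of division by [2] mentioned above.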

The index of a subgroup $H$ in $G$ is the number of ‘cosets’ $gH$; for finite groups it
equals $\|G\|/\|H\|$. A normal subgroup $N$ of a group is one obeying $gNg^{-1} = N$ for
all $g \in G$. Its importance arises because the set $G/H$ of cosets $gH$ has a natural group
structure precisely when $H$ is normal. If $H$ is a normal subgroup of $G$ we write $H \trianglelefteq G$;
if $H$ is merely a subgroup of $G$ we write $H \le G$. The kernel $\ker(\varphi) = \varphi^{-1}(e_H)$ of a
homomorphism $\varphi : G \to H$ is always normal in $G$, and $\mathrm{Im}\,\varphi \cong G/\ker\varphi$.

By the free group $F_n$ with generators $\{x_1, \ldots, x_n\}$ we mean the set of all possible words
in the ‘alphabet’ $x_1, x_1^{-1}, \ldots, x_n, x_n^{-1}$, with group operation given by concatenation. The
identity $e$ is the empty word. The only identities obeyed here are the trivial ones coming
from $x_i x_i^{-1} = x_i^{-1} x_i = e$. For example, $F_1 \cong \mathbb{Z}$. The group $F_2$ is already maximally
complicated, in that all the other $F_n$ arise as subgroups.

We call a group $G$ finitely generated if there are finitely many elements $g_1, \ldots, g_n \in G$
such that $G = \langle g_1, \ldots, g_n \rangle$, that is any $g \in G$ can be written as some finite word in the
alphabet $g_1^{\pm 1}, \ldots, g_n^{\pm 1}$. For example, any finite group is finitely generated, while the
additive group $\mathbb{R}$ is not. Any finitely generated group $G$ is the homomorphic image
$\varphi(F_n)$ of some free group $F_n$, i.e. $G \cong F_n/\ker(\varphi)$ (why?). This leads to the idea of
presentation: $G = \langle X \mid R \rangle$ where $X$ is a set of generators of $G$ and $R$ is a set of relations,
that is words that equal the identity $e$ in $G$. Enough words must be chosen so that $\ker\varphi$
equals the smallest normal subgroup of $F_n$ containing all of $R$. For example, here is a
presentation for the dihedral group $D_n$ (the symmetries of the regular $n$-sided polygon):

\[ D_n = \langle a, b \mid a^n = b^2 = abab = e \rangle. \tag{1.1.1} \]


For two interesting presentations of the trivial group $G = \{e\}$, see [416]. To define a
homomorphism $\varphi : G \to H$ it is enough to give the value $\varphi(g_i)$ of each generator of $G$,
and verify that $\varphi$ sends all relations of $G$ to identities in $H$.
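
As an illustration of checking relations on generators, here is a sketch that sends the generators $a$, $b$ of (1.1.1) to explicit permutations of the vertices $0, \ldots, n-1$ of the polygon. The choice of rotation and reflection, and all function names, are our own assumptions. The code confirms that the relations map to the identity and that the image has $2n$ elements.

```python
def compose(p, q):
    """Composition p after q of permutations stored as tuples of images."""
    return tuple(p[q[i]] for i in range(len(q)))


def power(p, k):
    """The k-fold composition of p with itself."""
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(p, result)
    return result


def generate(gens):
    """Subgroup generated by gens; for finite groups closure under products suffices."""
    identity = tuple(range(len(gens[0])))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


def dihedral_generators(n):
    """Assumed realisation: a rotates by one vertex, b reflects fixing vertex 0."""
    a = tuple((i + 1) % n for i in range(n))
    b = tuple((-i) % n for i in range(n))
    return a, b
```

Calling `generate(dihedral_generators(5))` returns a set of 10 permutations, the symmetries of the regular pentagon.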

We say $G$ equals the (internal) direct product $N \times H$ of subgroups if every element
$g \in G$ can be written uniquely as a product $nh$ with $n \in N$, $h \in H$, and where
$N$, $H$ are both normal subgroups of $G$ and $N \cap H = \{e\}$. Equivalently, the (external)
direct product $N \times H$ of two groups is defined to be all ordered pairs $(n, h)$, with
operations given by $(n, h)(n', h') = (nn', hh')$; $G$ will be the internal direct product
of its subgroups $N$, $H$ iff it is isomorphic to their external direct product. Of course,
$N \cong G/H$ and $H \cong G/N$. Direct product is also called ‘homogeneous extension’ in
the physics literature.

More generally, $G$ is an (internal) semi-direct product $N \rtimes H$ of subgroups if all
conditions of the internal direct product are satisfied, except that $H$ need not be normal
in $G$ (but as before, $N \trianglelefteq G$). Equivalently, the (external) semi-direct product $N \rtimes_\theta H$ of
two groups is defined to be all ordered pairs $(n, h)$ with operation given by

\[ (n, h)(n', h') = (n\,\theta_h(n'), hh'), \]

where $h \mapsto \theta_h \in \mathrm{Aut}\,N$ can be any group homomorphism. It’s a good exercise to verify
that $N \rtimes_\theta H$ is a group for any such $\theta$, and to relate the internal and external semi-direct
products. Note that $N \cong \{(n, e_H)\}$, $H \cong \{(e_N, h)\} \cong G/N$. Also, choosing the trivial
homomorphism $\theta_h = \mathrm{id}$ recovers the (external) direct product. The semi-direct product
is also called the ‘inhomogeneous extension’.
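
The exercise can be explored numerically for $N = \mathbb{Z}_n$ and $H = \mathbb{Z}_2$, written additively. This is only a sketch under our own naming: `theta` is passed in as a Python function, and the group axioms are checked by exhaustion, which is feasible only for small $n$.

```python
from itertools import product


def semidirect_product(n, theta):
    """Elements and product of Z_n x|_theta Z_2, with (a, h)(b, k) = (a + theta_h(b), h + k)."""
    elements = list(product(range(n), range(2)))

    def mul(x, y):
        (a, h), (b, k) = x, y
        return ((a + theta(h)(b)) % n, (h + k) % 2)

    return elements, mul


def is_group(elements, mul):
    """Exhaustive check of associativity, a unique identity, and inverses."""
    assoc = all(
        mul(mul(x, y), z) == mul(x, mul(y, z))
        for x in elements for y in elements for z in elements
    )
    ids = [e for e in elements if all(mul(e, x) == x == mul(x, e) for x in elements)]
    if not assoc or len(ids) != 1:
        return False
    e = ids[0]
    return all(any(mul(x, y) == e == mul(y, x) for y in elements) for x in elements)


def is_abelian(elements, mul):
    return all(mul(x, y) == mul(y, x) for x in elements for y in elements)


def trivial(n):
    """theta_h = id for every h: recovers the direct product."""
    return lambda h: (lambda b: b % n)


def inversion(n):
    """theta_0 = id and theta_1 = negation, an automorphism of Z_n of order 2."""
    def theta(h):
        if h == 0:
            return lambda b: b % n
        return lambda b: (-b) % n
    return theta
```

With `n = 3`, the trivial `theta` gives an abelian group of order 6 and the inversion gives a non-abelian one, so the same two factors can be assembled in inequivalent ways.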

For example, the dihedral group $D_n$ is a semi-direct product of $\mathbb{Z}_n$ with $\mathbb{Z}_2$. The group
of isometries (distance-preserving maps) in 3-space is $\mathbb{R}^3 \rtimes (\{\pm I\} \times \mathrm{SO}_3)$, where $\mathbb{R}^3$
denotes the additive subgroup of translations, $-I$ denotes the reflection $x \mapsto -x$ through
the origin, and $\mathrm{SO}_3$ is the group of rotations. This continuous group is an example of a
Lie group (Section 1.4.2). Closely related is the Poincaré group, which is the semi-direct
product of translations $\mathbb{R}^4$ with the Lorentz group $\mathrm{SO}_{3,1}$.

Finally and most generally, if N is a normal subgroup of G then we say that G is an 
(internal) extension of N by the quotient group G/N. Equivalently, we say a group G is 
an (external) extension of $N$ by $H$ if each element $g$ in $G$ can be identified with a pair
$(n, h)$, for $n \in N$ and $h \in H$, and where the group operation is

\[ (n, h)(n', h') = (\text{stuff}, hh'), \]

provided only that $(n, e_H)(n', e_H) = (nn', e_H)$.

That irritating carry in base 10 addition, which causes so many children so much grief,
is the price we pay for building up our number system by repeatedly extending by the
group $\mathbb{Z}_{10}$ (one for each digit) (see Question 1.1.8(c)).
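
To make the remark about carrying concrete, here is one possible reading for two-digit numbers, i.e. for $\mathbb{Z}_{100}$ viewed as an extension of $\mathbb{Z}_{10}$ (tens digits) by $\mathbb{Z}_{10}$ (units digits). The pair convention (tens, units) and the function names are our own assumptions. The carry plays the role of the ‘stuff’ in the extension formula.

```python
def carry(h, k):
    """1 when the units digits h and k overflow past 9, otherwise 0."""
    return (h + k) // 10


def add_pairs(x, y):
    """Add two elements of Z_100 written as pairs (tens digit, units digit)."""
    (n, h), (m, k) = x, y
    return ((n + m + carry(h, k)) % 10, (h + k) % 10)


def to_pair(a):
    """Split 0 <= a < 100 into its (tens, units) digits."""
    return ((a // 10) % 10, a % 10)


def from_pair(x):
    """Reassemble the number from its (tens, units) digits."""
    n, h = x
    return 10 * n + h
```

Dropping the `carry` term would give the direct product $\mathbb{Z}_{10} \times \mathbb{Z}_{10}$ instead, which is a different group from $\mathbb{Z}_{100}$.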

A group $G$ is abelian if $gh = hg$ for all $g, h \in G$. So $\mathbb{Z}_n$ is abelian, but the symmetric
group $S_n$ for $n > 2$ is not. A group is cyclic if it has only one generator. The only cyclic
groups are the abelian groups $\mathbb{Z}_n$ and $\mathbb{Z}$. The centre $Z(G)$ of a group is defined to be all
elements $g \in G$ commuting with all other $h \in G$; it is always a normal abelian subgroup.


Theorem 1.1.1 (Fundamental theorem of finitely generated abelian groups) Let
$G$ be a finitely generated abelian group. Then

\[ G \cong \mathbb{Z}^r \times \mathbb{Z}_{m_1} \times \cdots \times \mathbb{Z}_{m_h}, \]

where $\mathbb{Z}^r = \mathbb{Z} \times \cdots \times \mathbb{Z}$ ($r$ times), and $m_1$ divides $m_2$ which divides ... which divides
$m_h$. The numbers $r$, $m_i$, $h$ are unique. The group $G$ is finite iff $r = 0$.


The proof isn’t difficult — for example, see page 43 of [504]. Theorem 1.1.1 is closely 
related to other classical decompositions, such as that of the Jordan canonical form for 
matrices. 


1.1.2 Finite simple groups 


Theorem 1.1.1 gives among other things the classification of all finite abelian groups. In
particular, the number of abelian groups $G$ of order $\|G\| = n = \prod_p p^{a_p}$ is $\prod_p P(a_p)$,
where $P(m)$ is the partition number of $m$ (the number of ways of writing $m$ as a sum
$m = \sum_i m_i$, $m_1 \ge m_2 \ge \cdots > 0$).
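
The counting formula $\prod_p P(a_p)$ is simple to evaluate. The sketch below uses our own function names and a naive recursive partition count, which is fine for the small exponents that occur in practice.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def partitions(m, largest=None):
    """Number of partitions of m into parts of size at most `largest` (default m)."""
    if largest is None:
        largest = m
    if m == 0:
        return 1
    return sum(
        partitions(m - part, min(part, m - part))
        for part in range(1, min(m, largest) + 1)
    )


def prime_exponents(n):
    """Exponents a_p in the factorisation n = prod p^{a_p}, keyed by the prime p."""
    exps, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            exps[p] = exps.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        exps[n] = exps.get(n, 0) + 1
    return exps


def count_abelian_groups(n):
    """Number of abelian groups of order n, up to isomorphism."""
    result = 1
    for a in prime_exponents(n).values():
        result *= partitions(a)
    return result
```

For example, order 12 gives $P(2)P(1) = 2$ groups, namely $\mathbb{Z}_{12}$ and $\mathbb{Z}_6 \times \mathbb{Z}_2$.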

What can we say about the classification of arbitrary finite groups? This is almost
certainly hopeless. All groups of order $p$ or $p^2$ (for $p$ prime) are necessarily abelian. The
smallest non-abelian group is the symmetric group $S_3$ (order 6); next are the dihedral
group $D_4$ and the quaternion group $Q_4 = \{\pm 1, \pm i, \pm j, \pm k\}$ (both order 8). Table 1.1
summarises the situation up to order 50; for orders up to 100, see [418]. This can’t be
pushed that much further: for example, the groups of order 128 (there are 2328 of them)
were classified only in 1990. One way to make progress is to restrict the class of groups
considered.


Table 1.1. The numbers of non-abelian groups of order < 50


Every group has two trivial normal subgroups: itself and $\{e\}$. If these are the only
normal subgroups, the group is called simple. It is conventional to regard the trivial
group $\{e\}$ as not simple (just as ‘1’ is conventionally regarded as not prime). An alternate
definition of a simple group $G$ is that if $\varphi : G \to H$ is any homomorphism, then either
$\varphi$ is constant (i.e. $\varphi(G) = \{e\}$) or $\varphi$ is one-to-one.

The importance of simple groups is provided by the Jordan-Hölder Theorem. By a
‘composition series’ for a group $G$, we mean a nested sequence

\[ G = H_0 > H_1 > H_2 > \cdots > H_k > H_{k+1} = \{e\} \tag{1.1.2} \]

of groups such that $H_i$ is normal in $H_{i-1}$ (though not necessarily normal in $H_{i-2}$), and
the quotient $H_{i-1}/H_i$ (called a ‘composition factor’) is simple. An easy induction shows
that any finite group $G$ has at least one composition series. If $H'_0 > \cdots > H'_{\ell+1} = \{e\}$
is a second composition series for $G$, then the Jordan-Hölder Theorem says that $k = \ell$
and, up to a reordering $\pi$, the simple groups $H_{i-1}/H_i$ and $H'_{\pi i - 1}/H'_{\pi i}$ are isomorphic.
The cyclic group $\mathbb{Z}_n$ is simple iff $n$ is prime. Two composition series of $\mathbb{Z}_{12} = \langle 1 \rangle$ are

\[ \mathbb{Z}_{12} > \langle 2 \rangle > \langle 4 \rangle > \langle 12 \rangle, \]
\[ \mathbb{Z}_{12} > \langle 3 \rangle > \langle 6 \rangle > \langle 12 \rangle, \]

corresponding to composition factors $\mathbb{Z}_2, \mathbb{Z}_2, \mathbb{Z}_3$ and $\mathbb{Z}_3, \mathbb{Z}_2, \mathbb{Z}_2$. This is reminiscent of
$2 \cdot 2 \cdot 3 = 3 \cdot 2 \cdot 2$ both being prime factorisations of 12. When all composition factors
of a group are cyclic, the group is called solvable. The deep Feit-Thompson Theorem
tells us that any group of odd order is solvable, as are all abelian groups and any
group of order $< 60$ (Question 1.1.2). The name ‘solvable’ comes from Galois theory
(Section 1.7.2).
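
For cyclic groups the composition series can be listed mechanically, since the subgroups of $\mathbb{Z}_n$ are the $\langle d \rangle$ for $d$ dividing $n$. The sketch below, with our own function names, records a series as a chain of divisors $1 \mid d_1 \mid \cdots \mid n$ whose successive ratios are prime; each ratio is the order of the corresponding composition factor.

```python
def prime_factors(n):
    """Prime factors of n with multiplicity, in increasing order."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def composition_series(n):
    """All chains [1, d_1, ..., n] of divisors of n with prime successive ratios.

    The divisor d stands for the subgroup <d> of Z_n, which has order n // d.
    """
    def extend(chain):
        d = chain[-1]
        if d == n:
            yield chain
            return
        for p in sorted(set(prime_factors(n // d))):
            yield from extend(chain + [d * p])

    return list(extend([1]))


def factor_orders(chain):
    """Orders of the composition factors <d_i>/<d_{i+1}>, each cyclic of prime order."""
    return [b // a for a, b in zip(chain, chain[1:])]
```

For $n = 12$ this finds three series, including the two displayed above, and every one of them has factor orders 2, 2, 3 in some order, as Jordan-Hölder requires.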

Finite groups are a massive generalisation of the notion of number. The number $n$
can be identified with the cyclic group $\mathbb{Z}_n$. The divisor of a number corresponds to a
normal subgroup, so a prime number corresponds to a simple group. The Jordan-Hölder
Theorem generalises the uniqueness of prime factorisations. Building up any number by
multiplying primes becomes building up a group by (semi-)direct products and, more
generally, by group extensions. Note however that $\mathbb{Z}_6 \times \mathbb{Z}_2$ and $S_3 \times \mathbb{Z}_2$, both different
from $\mathbb{Z}_{12}$, also have $\mathbb{Z}_2, \mathbb{Z}_2, \mathbb{Z}_3$ as composition factors. The lesson: unlike for numbers,
‘multiplication’ here does not give a unique answer. The semi-direct product $\mathbb{Z}_3 \rtimes \mathbb{Z}_2$ can
equal either $\mathbb{Z}_6$ or $S_3$, depending on how the product is taken.

The composition series (1.1.2) tells us that the finite group $G$ is obtained inductively
from the trivial group $\{e\}$ by extending $\{e\}$ by the simple group $H_k/H_{k+1}$ to get $H_k$,
then extending $H_k$ by the simple group $H_{k-1}/H_k$ to get $H_{k-1}$, etc. In other words, any
finite group $G$ can be obtained from the trivial group by extending inductively by simple
groups; those simple groups are its ‘prime factors’ = composition factors.

Thus simple groups have an importance for group theory approximating what primes 
have for number theory. One of the greatest accomplishments of twentieth-century mathematics
is the classification of the finite simple groups. Of course we would have preferred
the complete finite group classification, but the simple groups are a decent compromise! 
This work, completed in the early 1980s (although gaps in the arguments are continually 
being discovered and filled [22]), runs to approximately 15000 journal pages, spread 
over 500 individual papers, and is the work of a whole generation of group theorists (see 
[256], [512] for historical remarks and some ideas of the proof). A modern revision is 
currently underway to simplify the proof and find and fill all gaps, but the final proof is 
still expected to be around 4000 pages long. The resulting list, probably complete, is: 


• the cyclic groups $\mathbb{Z}_p$ ($p$ a prime);
• the alternating groups $A_n$ for $n \ge 5$;
• 16 families of Lie type;
• 26 sporadic groups.


The alternating group $A_n$ consists of the even permutations in the symmetric group
$S_n$, and so has order (= size) $\frac{1}{2} n!$. The groups of Lie type are essentially Lie groups
(Section 1.4.2) defined over the finite fields $\mathbb{F}_q$, sometimes ‘twisted’. See, for example,
chapter I.4 of [92] for an elementary treatment. The simplest example is $\mathrm{PSL}_n(\mathbb{F}_q)$,
which consists of the $n \times n$ matrices with entries in $\mathbb{F}_q$, with determinant 1, quotiented
out by the centre of $\mathrm{SL}_n(\mathbb{F}_q)$ (namely the scalar matrices $\mathrm{diag}(a, a, \ldots, a)$ with $a^n = 1$)
($\mathrm{PSL}_2(\mathbb{Z}_2)$ and $\mathrm{PSL}_2(\mathbb{Z}_3)$ aren’t simple so should be excluded). The ‘P’ here stands for
‘projective’ and refers to this quotient, while the ‘S’ stands for ‘special’ and means
determinant 1.

The determinant $\det(\rho(g))$ of any representation $\rho$ (Section 1.1.3) of a noncyclic simple
group must be identically 1, and the centre of any noncyclic simple group must be trivial
(why?). Hence in the list of simple groups of Lie type are found lots of P’s and S’s.

The smallest noncyclic simple group is $A_5$, with order 60. It is isomorphic to $\mathrm{PSL}_2(\mathbb{Z}_5)$
and $\mathrm{PSL}_2(\mathbb{F}_4)$, and can also be interpreted as the group of all rotational symmetries of a
regular icosahedron (reflections have determinant $-1$ and so cannot belong to any simple
group $\not\cong \mathbb{Z}_2$). The simplicity of $A_5$ is ultimately responsible for the fact that the zeros of
a general quintic polynomial cannot be solved by radicals (see Section 1.7.2).
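
The order 60 of $\mathrm{PSL}_2(\mathbb{Z}_5)$ can be confirmed by direct enumeration. The following sketch uses our own function names and assumes $p$ is prime, so that $\mathbb{Z}_p$ is a field. It counts the determinant-1 matrices and divides by the number of scalar matrices among them.

```python
from itertools import product


def sl2(p):
    """All 2x2 matrices (a, b, c, d) over Z_p, p prime, with determinant ad - bc = 1."""
    return [
        (a, b, c, d)
        for a, b, c, d in product(range(p), repeat=4)
        if (a * d - b * c) % p == 1
    ]


def scalar_centre(p):
    """Scalar matrices diag(a, a) of determinant 1, i.e. those with a^2 = 1 in Z_p."""
    return [(a, 0, 0, a) for a in range(p) if (a * a) % p == 1]


def psl2_order(p):
    """Order of PSL_2(Z_p): the order of SL_2(Z_p) divided by that of its scalar centre."""
    return len(sl2(p)) // len(scalar_centre(p))
```

The small cases $p = 2, 3$ give orders 6 and 12, too small to be noncyclic simple groups, in line with their exclusion above.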

The smallest sporadic group is the Mathieu group $M_{11}$, order 7920, discovered in
1861.² The largest is the Monster $\mathbb{M}$,³ conjectured independently by Fischer and Griess
in 1973 and finally proved to exist by Griess [263] in 1980. Its order is given in (0.2.2).
20 of the 26 sporadic groups are involved in (i.e. are quotients of subgroups of) the
Monster, and play some role in Moonshine, as we see throughout Section 7.3. We study
the Monster in more detail in Section 7.1.1. Some relations among $\mathbb{M}$, the Leech lattice
$\Lambda$ and the largest Mathieu group $M_{24}$ are given in chapters 10 and 29 of [113]. We collect
together some of the data of the sporadics in Table 7.1.


2 although his arguments apparently weren’t very convincing. In fact some people, including the Camille
Jordan of Jordan-Hölder fame, argued in later papers that the largest of Mathieu’s groups, $M_{24}$, couldn’t
exist. We now know it does, for example an elegant realisation is as the automorphisms of Steiner system
$S(5, 8, 24)$.

3 Griess also came up with the symbol for the Monster; Conway came up with the name. It’s a little
unfortunate (but perhaps inevitable) that the Monster is not named after its codiscoverers, Berndt Fischer
and Robert Griess; the name ‘Friendly Giant’ was proposed in [263] as a compromise, but ‘Monster’ stuck.

This work reduces the construction and classification of all finite groups to under- 
standing the possible extensions by simple groups. Unfortunately, group extensions turn 
out to be technically quite difficult and lead one into group cohomology. 

There are many classifications in mathematics. Most of them look like phone books, 
and their value is purely pragmatic: for example, as a list of potential counterexamples, 
and as a way to prove some theorems by exhaustion. And of course obtaining them 
requires at least one paper, and with it some breathing space before those scoundrels on 
the grant evaluation boards. But when the classification has structure, it can resemble 
in ways a tourist guide, hinting at new sites to explore. The 18 infinite families in the 
finite simple group classification are well known and generic, much like the chain of 
McDonald’s restaurants, useful and interesting in their own ways. But the eye skims
over them, and is drawn instead to the 26 sporadic groups and in particular to the largest: 
the Monster. 


1.1.3 Representations 


Groups typically arise as ‘things that act’. This is their raison d’être. For instance, the
symmetries of a square form the dihedral group D4 — that is, the elements of D4 act on 
the vertices by permuting them. When a group acts on a structure, you generally want 
it to preserve the essential features of the structure. In the case of our square, we want 
adjacent vertices to remain adjacent after being permuted. 

So when a group $G$ acts on a vector space $V$ (over $\mathbb{C}$, say), we want it to act ‘linearly’.
The action $g.v$ of $G$ on $V$ gives $V$ the structure of a $G$-module. In completely equivalent
language, it is a representation $\rho$ of $G$ on $V \cong \mathbb{C}^n$, that is a group homomorphism
from $G$ to the invertible matrices $\mathrm{GL}_n(\mathbb{C})$. So a representation $\rho$ is a realisation of the
group $G$ by matrices, where multiplication in $G$ corresponds to matrix multiplication:

\[ \rho(gh) = \rho(g)\,\rho(h). \]

The identification of $V$ with $\mathbb{C}^n$ is achieved by choosing a basis of $V$, so the module
language is ‘cleaner’ in the sense that it is basis-independent, but this also tends to make
it less conducive for practical calculations. The module action $g.v$ is now written $\rho(g)v$,
where $v$ is the column vector consisting of the components of $v \in V$ with respect to the
given basis. If $\rho(g)$ are $n \times n$ matrices, we say $\rho$ is an $n$-dimensional representation.

For a practice example, consider the symmetric group

\[ S_3 = \{(1), (12), (23), (13), (123), (132)\}. \tag{1.1.3} \]

These cycles multiply as $(13)(123) = (12)$. One representation of $S_3$ is one-dimensional,
and sends all six elements of $S_3$ to the $1 \times 1$ identity matrix:

\[ \rho_1(\sigma) = (1), \quad \forall \sigma \in S_3. \]

Obviously the homomorphism property $\rho(gh) = \rho(g)\rho(h)$ is satisfied, and so this defines
a representation. But it’s trivial, projecting away all structure in the group $S_3$. Much more
interesting is the defining representation $\rho_3$, which assigns to each $\sigma \in S_3$ a $3 \times 3$
permutation matrix by using $\sigma$ to permute the rows of the identity matrix $I$. For example

\[ (12) \mapsto \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad
(13) \mapsto \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad
(123) \mapsto \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}. \]

This representation is faithful, that is different permutations $\sigma$ are assigned different
matrices $\rho_3(\sigma)$. From this defining representation $\rho_3$, we get a second one-dimensional
one, called the sign representation $\rho_s$, by taking determinants. For example, $(1) \mapsto (+1)$,
$(12) \mapsto (-1)$, $(13) \mapsto (-1)$ and $(123) \mapsto (+1)$.
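
The defining and sign representations of $S_3$ can be verified by machine. In this sketch (the names and the encoding are our own) a permutation is a tuple of images on $\{0, 1, 2\}$, products are read right to left, and the matrix of $\sigma$ sends the basis vector $e_i$ to $e_{\sigma(i)}$. These assumed conventions reproduce the matrices and the product $(13)(123) = (12)$ quoted above.

```python
from itertools import permutations


def compose(p, q):
    """Composition p after q of permutations stored as tuples of images."""
    return tuple(p[q[i]] for i in range(len(q)))


def perm_matrix(p):
    """Matrix sending e_i to e_{p(i)}: the entry in row p(i), column i is 1."""
    n = len(p)
    return [[1 if p[j] == i else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# The six elements of S_3 as permutations of {0, 1, 2}.
S3 = list(permutations(range(3)))
```

Here the 3-cycle (123) is the tuple `(1, 2, 0)` and the transposition (12) is `(1, 0, 2)`; taking `det3(perm_matrix(p))` gives the sign representation.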

The most important representation associated with a group $G$ is the regular representation
given by the group algebra $\mathbb{C}G$. That is, consider the $\|G\|$-dimensional vector space
(over $\mathbb{C}$, say) consisting of all formal linear combinations $\sum_{h \in G} \alpha_h h$, where $\alpha_h \in \mathbb{C}$.
This has a natural structure of a $G$-module, given by $g.(\sum_h \alpha_h h) = \sum_h \alpha_h\, gh$.

When $G$ is infinite, there will be convergence issues and hence analysis since infinite
sums $\sum \alpha_h h$ are involved. The most interesting possibility is to interpret $h \mapsto \alpha_h$ as a
$\mathbb{C}$-valued function $\alpha(h)$ on $G$. Suppose we have a $G$-invariant measure $d\mu$ on this space
of functions $\alpha : G \to \mathbb{C}$; this means that the integral $\int_G \alpha(h)\, d\mu(h)$ will exist and equal
$\int_G \alpha(gh)\, d\mu(h)$ whenever the latter exists. For example, if $G$ is discrete, define ‘$\int_G \alpha\, d\mu$’
to be $\sum_{h \in G} \alpha(h)$, while if $G$ is the additive group $\mathbb{R}$, $d\mu(x)$ is the Lebesgue measure (see
Section 1.3.1). Looking at the $g$-coefficient of the product $(\sum_h \alpha_h h)(\sum_k \beta_k k)$, we get
the formula $g \mapsto \sum_h \alpha_h \beta_{h^{-1}g}$, which we recognise as the convolution product (recall
$(\alpha * \beta)(x) = \int \alpha(y)\, \beta(x - y)\, dy$) in, for example, Fourier analysis. In this context, the regular
representation of $G$ becomes the Hilbert space $L^2(G)$ of square-integrable functions
(i.e. $\int |\alpha|^2\, d\mu < \infty$); the convolution product defines an action of $L^2(G)$ on itself. Note
however that the $L^2(\mathbb{R})$-module $L^2(\mathbb{R})$, for a typical example, doesn’t restrict to an
$\mathbb{R}$-module: the action of $\mathbb{R}$ on $\alpha \in L^2(\mathbb{R})$ by $(x.\alpha)(y) = \alpha(x + y)$ corresponds to the
convolution product of $\alpha$ with the ‘Dirac delta’ distribution $\delta$ centred at $x$. We return to
$L^2(G)$ in Section 1.5.5.
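
In the finite case nothing analytic is needed, and the convolution formula can be checked directly. The sketch below works in the group algebra of the additive group $\mathbb{Z}_n$, so that $h^{-1}g$ becomes $g - h$; an element $\sum_h \alpha_h h$ is stored as the list of its $n$ coefficients, and the function names are our own.

```python
def algebra_product(alpha, beta, n):
    """Multiply two elements of the group algebra of Z_n by expanding term by term."""
    out = [0] * n
    for h, a in enumerate(alpha):
        for k, b in enumerate(beta):
            out[(h + k) % n] += a * b
    return out


def convolution(alpha, beta, n):
    """The g-coefficient formula: the sum over h of alpha[h] * beta[g - h]."""
    return [sum(alpha[h] * beta[(g - h) % n] for h in range(n)) for g in range(n)]


def translate(g, alpha, n):
    """Module action of g: the coefficient of k in g.alpha is alpha[k - g]."""
    return [alpha[(k - g) % n] for k in range(n)]
```

In this discrete setting the group element $g$ itself is a legitimate element of the algebra (a ‘delta’ supported at $g$), and multiplying by it reproduces `translate`; it is this delta that fails to be square-integrable when $G = \mathbb{R}$.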

Two representations $\rho$, $\rho'$ are called equivalent if they differ merely by a change
of coordinate axes (basis) in the ambient space $\mathbb{C}^n$, that is if there exists a matrix $U$
such that $\rho'(g) = U\rho(g)U^{-1}$ for all $g$. The direct sum $\rho' \oplus \rho''$ of representations is
given by

\[ (\rho' \oplus \rho'')(g) = \begin{pmatrix} \rho'(g) & 0 \\ 0 & \rho''(g) \end{pmatrix}. \tag{1.1.4a} \]

The tensor product $\rho' \otimes \rho''$ of representations is given by $(\rho' \otimes \rho'')(g) = \rho'(g) \otimes \rho''(g)$,
where the Kronecker product $A \otimes B$ of matrices is defined by the following block form:

\[ A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \cdots \\ a_{21}B & a_{22}B & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}. \tag{1.1.4b} \]
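
Both block constructions are short to write down for matrices stored as lists of rows. This is a sketch with our own function names. The final check, that Kronecker products multiply blockwise, is the standard fact that makes $\rho' \otimes \rho''$ a homomorphism; we state it here as background rather than as something proved in the text.

```python
def direct_sum(A, B):
    """Block-diagonal matrix with A in the upper-left and B in the lower-right."""
    n, m = len(A), len(B)
    top = [list(row) + [0] * m for row in A]
    bottom = [[0] * n + list(row) for row in B]
    return top + bottom


def kronecker(A, B):
    """Block matrix whose (i, j) block is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def trace(M):
    return sum(M[i][i] for i in range(len(M)))
```

Traces add under `direct_sum` and multiply under `kronecker`, which is why characters of direct sums and tensor products are sums and products of characters.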


The contragredient or dual $\rho^*$ of a representation is given by the formula

\[ \rho^*(g) = \left(\rho(g^{-1})\right)^t, \tag{1.1.4c} \]

so called because it’s the natural representation on the space $V^*$ dual to the space $V$ on
which $\rho$ is defined. For any finite group representation defined over a subfield of $\mathbb{C}$, the
dual $\rho^*$ is equivalent to the complex conjugate representation $g \mapsto \overline{\rho(g)}$.

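Returning to our $S_3$ example, the given matrices for $\rho_3$ were obtained by having $S_3$ act
on coordinates with respect to the standard basis $\{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$. If instead
we choose the basis $\{(1, 1, 1), (1, -1, 0), (0, 1, -1)\}$, these matrices become

\[ (12) \mapsto \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \quad
(13) \mapsto \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & -1 & 0 \end{pmatrix}, \quad
(123) \mapsto \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix}. \]

It is manifest here that $\rho_3$ is a direct sum of $\rho_1$ (the upper-left $1 \times 1$ block) with a
two-dimensional representation $\rho_2$ (the lower-right $2 \times 2$ block) given by

\[ (12) \mapsto \begin{pmatrix} -1 & 1 \\ 0 & 1 \end{pmatrix}, \quad
(13) \mapsto \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix}, \quad
(123) \mapsto \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix}. \]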
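
The change of basis can be confirmed without inverting anything: if $U$ has the new basis vectors as its columns, then a matrix $M'$ represents the same operator as $M$ exactly when $MU = UM'$. The sketch below checks this for the three permutations displayed; the dictionary names and string labels are our own.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


# Columns of U are the new basis vectors (1,1,1), (1,-1,0), (0,1,-1).
U = [[1, 1, 0],
     [1, -1, 1],
     [1, 0, -1]]

# Matrices of rho_3 in the standard basis.
STANDARD = {
    "(12)": [[0, 1, 0], [1, 0, 0], [0, 0, 1]],
    "(13)": [[0, 0, 1], [0, 1, 0], [1, 0, 0]],
    "(123)": [[0, 0, 1], [1, 0, 0], [0, 1, 0]],
}

# The same operators written in the new basis.
NEW_BASIS = {
    "(12)": [[1, 0, 0], [0, -1, 1], [0, 0, 1]],
    "(13)": [[1, 0, 0], [0, 0, -1], [0, -1, 0]],
    "(123)": [[1, 0, 0], [0, 0, -1], [0, 1, -1]],
}


def intertwines(name):
    """True when STANDARD[name] U equals U NEW_BASIS[name]."""
    return matmul(STANDARD[name], U) == matmul(U, NEW_BASIS[name])


def lower_block(M):
    """The lower-right 2x2 block, which carries rho_2."""
    return [row[1:] for row in M[1:]]
```

The blocks `lower_block(NEW_BASIS[...])` have traces 0, 0 and $-1$, the values of the character of $\rho_2$ on these elements.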


An irreducible or simple module is a module that contains no nontrivial submodule. 
‘Submodule’ plays the role of divisor here, and ‘irreducible’ the role of prime number. A 
module is called completely reducible if it is the direct sum of finitely many irreducible 
modules. For example, the $S_3$ representations $\rho_1$, $\rho_s$ and $\rho_2$ are irreducible, while
$\rho_3 = \rho_1 \oplus \rho_2$ is completely reducible.

A representation is called unitary if it is equivalent to one whose matrices $\rho(g)$ are
all unitary (i.e. their inverses equal their complex conjugate transposes). A more
basis-independent definition is that a $G$-module $V$ is unitary if there exists a Hermitian form
$\langle u, v \rangle \in \mathbb{C}$ on $V$ such that

\[ \langle g.u, g.v \rangle = \langle u, v \rangle. \]

By definition, a Hermitian form $\langle u, v \rangle : V \times V \to \mathbb{C}$ is linear in $v$ and anti-linear in $u$,
i.e.

\[ \langle au + a'u', bv + b'v' \rangle = \bar{a} b \langle u, v \rangle + \bar{a} b' \langle u, v' \rangle + \overline{a'} b \langle u', v \rangle + \overline{a'} b' \langle u', v' \rangle, \]

for all $a, a', b, b' \in \mathbb{C}$, $u, u', v, v' \in V$, and finally $\langle u, u \rangle > 0$ for all nonzero $u \in V$.
When $V$ is finite-dimensional, a basis can always be found in which its Hermitian
form looks like $\langle x, y \rangle = \sum_i \bar{x}_i y_i$. Most representations of interest in quantum physics
are unitary. Unitary representations are much better behaved than non-unitary ones.
For instance, an easy argument shows that any finite-dimensional unitary representation is
completely reducible.

An indecomposable module is one that isn’t the direct sum of smaller ones. An
indecomposable module may be reducible: its matrices could be put into the form

\[ \rho(g) = \begin{pmatrix} A(g) & B(g) \\ 0 & D(g) \end{pmatrix}, \]

where for some $g$ the submatrix $B(g)$ isn’t the 0-matrix (otherwise we would recover
(1.1.4a)). Then $A(g)$ is a subrepresentation, but $D(g)$ isn’t. For finite groups, however,
irreducible = indecomposable:


Theorem 1.1.2 (Burnside, 1904) Let G be finite and the field be C. Any G-module is 
unitary and will be completely reducible if it is finite-dimensional. There are only finitely 
many irreducible G-modules; their number equals the number of conjugacy classes 
of G. 


The conjugacy classes are the sets $K_g = \{h^{-1}gh \mid h \in G\}$. This fundamental result
fails for infinite groups. For example, take $G$ to be the additive group $\mathbb{Z}$ of integers.
Then there are uncountably many one-dimensional representations of $G$, and
there are representations that are reducible but indecomposable (see Question 1.1.6(a)).
Theorem 1.1.2 is proved using a projection defined by certain averaging over $G$, as well
as:


Lemma 1.1.3 (Schur’s Lemma) Let $G$ be finite and $\rho$, $\rho'$ be representations.

(a) $\rho$ is irreducible iff the only matrices $A$ commuting with all matrices $\rho(g)$, $g \in G$
(that is, $A\rho(g) = \rho(g)A$) are of the form $A = aI$ for $a \in \mathbb{C}$, where $I$ is the identity
matrix.

(b) Suppose both $\rho$ and $\rho'$ are irreducible. Then $\rho$ and $\rho'$ are isomorphic iff there is a
nonzero matrix $A$ such that $A\rho(g) = \rho'(g)A$ for all $g \in G$.


Schur’s Lemma is an elementary observation central to representation theory. It’s proved 
by noting that the kernel (nullspace) and range (column space) of A are G-invariant. 
The character$^4$ $\mathrm{ch}_\rho$ of a representation $\rho$ is the map $G \to \mathbb{C}$ given by the trace:

$$\mathrm{ch}_\rho(g) = \mathrm{tr}(\rho(g)). \tag{1.1.5}$$

We see that equivalent representations have the same character, because of the fundamental
identity $\mathrm{tr}(AB) = \mathrm{tr}(BA)$. Remarkably, for finite groups (and $\mathbb{C}$), the converse is also
true: inequivalent representations have different characters. That trace identity also tells us
that the character is a 'class function', i.e. $\mathrm{ch}_\rho(hgh^{-1}) = \mathrm{tr}(\rho(h)\rho(g)\rho(h)^{-1}) = \mathrm{ch}_\rho(g)$,
so $\mathrm{ch}_\rho$ is constant on each conjugacy class $K_g$. Group characters are enormously simpler
than representations: for example, the smallest nontrivial representation of the Monster
$\mathbb{M}$ consists of about $10^{54}$ matrices, each of size 196 883 × 196 883, while its character
consists of 194 complex numbers. The reason is that the representation matrices have a lot
of redundant, basis-dependent information, to which the character is happily oblivious.

4 Surprisingly, characters were invented before group representations, by Frobenius in 1896. He defined
characters indirectly, by writing the 'class sums' $C_i$ in terms of the idempotents of the centre of the group
algebra. It took him a year to realise they could be reinterpreted as the traces of matrices.

Table 1.2. The character table of $S_3$

ch \ σ     (1)    (12)    (123)
ch_1        1       1       1
ch_s        1      −1       1
ch_2        2       0      −1

The Thompson trick mentioned in Section 0.3 tells us: A dimension can (and should) 
be twisted; that twist is called a character. Indeed, $\mathrm{ch}_\rho(e) = \dim(\rho)$, where the dimension
of $\rho$ is defined to be the dimension of the underlying vector space $V$, or the size $n$ of the
$n \times n$ matrices $\rho(g)$. When we see a positive integer, we should try to interpret it as a
dimension of a vector space; if there is a symmetry present, then it probably acts on the 
space, in which case we should see what significance the other character values may have. 

Algebra searches for structure. What can we say about the set of characters? First, 
note directly from (1.1.4) that we can add and multiply characters: 


$$\mathrm{ch}_{\rho\oplus\rho'}(g) = \mathrm{ch}_\rho(g) + \mathrm{ch}_{\rho'}(g), \tag{1.1.6a}$$
$$\mathrm{ch}_{\rho\otimes\rho'}(g) = \mathrm{ch}_\rho(g)\,\mathrm{ch}_{\rho'}(g), \tag{1.1.6b}$$
$$\mathrm{ch}_{\rho^*}(g) = \overline{\mathrm{ch}_\rho(g)}. \tag{1.1.6c}$$

Therefore the complex span of the characters forms a (commutative associative) algebra.
For $G$ finite (and the field algebraically closed), each matrix $\rho(g)$ is separately
diagonalisable, with eigenvalues that are roots of 1 (why?). This means that each character value
$\mathrm{ch}_\rho(g)$ is a sum of roots of 1.

By the character table of a group $G$ we mean the array with rows indexed by the
characters $\mathrm{ch}_\rho$ of irreducible representations, and the columns by conjugacy classes $K_g$,
and with entries $\mathrm{ch}_\rho(g)$. An example is given in Table 1.2. Different groups can have
identical character tables: for instance, for any $n$, the dihedral group $D_{4n}$ has the same
character table as the quaternionic group $Q_{4n}$ defined by the presentation


Qan = (a, b|a* = b”, abab = e). (1.1.7) 


In spite of this, the characters of a group G tell us much about G — for example, its order, 
all of its normal subgroups, whether or not it’s simple, whether or not it’s solvable ... 
In fact, the character table of a finite simple group determines the group uniquely [100] 
(its order alone usually distinguishes it from other simple groups). This suggests: 


Problem Suppose G and H have identical character tables (up to appropriate per- 
mutations of rows and columns). Must they have the same composition factors? 


After all, the answer is certainly yes for solvable G (why?). 
It may seem that 'trace' is a fairly arbitrary operation to perform on the matrices
$\rho(g)$; certainly there are other invariants we can attach to a representation $\rho$ so that
equivalent representations are assigned equal numbers. For example, how about
$g \mapsto \det\rho(g)$? This is too limited, because it is a group homomorphism (e.g. what happens
when $G$ is simple?). But more generally, choose an independent variable $x_g$ for each
element $g \in G$, and for any representation $\rho$ of $G$ define the group determinant of $\rho$

$$\Theta_\rho = \det\Bigl(\sum_{g\in G} x_g\,\rho(g)\Bigr).$$

This is a multivariable polynomial $\Theta_\rho$, homogeneous of degree $n = \dim(\rho)$. The
character $\mathrm{ch}_\rho(g)$ can be obtained from the group determinant $\Theta_\rho$: it is the coefficient of the
$x_g x_e^{n-1}$ term. In fact, the group $G$ is uniquely determined by the group determinant of
the regular representation $\mathbb{C}G$. See the review article [315].

One use of characters is to identify representations. For this purpose the orthogonality
relations are crucial: given any characters $\mathrm{ch}$, $\mathrm{ch}'$ of $G$, define the Hermitian form

$$\langle \mathrm{ch}, \mathrm{ch}'\rangle = \frac{1}{\|G\|}\sum_{g\in G} \overline{\mathrm{ch}(g)}\,\mathrm{ch}'(g). \tag{1.1.8a}$$

Write $\rho_i$ for the irreducible representations, and $\mathrm{ch}_i$ for the corresponding traces. Then

$$\langle \mathrm{ch}_i, \mathrm{ch}_j\rangle = \delta_{ij}, \tag{1.1.8b}$$

that is, the irreducible characters $\mathrm{ch}_i$ are an orthonormal basis with respect to (1.1.8a). If
the rows of a matrix are orthonormal, so are the columns. Hence (1.1.8b) implies

$$\sum_i \overline{\mathrm{ch}_i(g)}\,\mathrm{ch}_i(h) = \frac{\|G\|}{\|K_g\|}\,\delta_{K_g,K_h}. \tag{1.1.8c}$$

The decomposition of $\mathbb{C}G$ into irreducibles is now immediate:

$$\mathbb{C}G \cong \bigoplus_i (\dim\rho_i)\,\rho_i,$$

that is, each irreducible representation appears with multiplicity given by its dimension.
Taking the dimension of both sides, we obtain the useful identity

$$\|G\| = \sum_i (\dim\rho_i)^2.$$
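As a sanity check, the orthogonality relations (1.1.8) and the sum-of-squares identity can be verified numerically for $S_3$ from Table 1.2. The following Python sketch is an illustration only: the names `class_sizes`, `table`, `pairing` and `column_sum` are hypothetical, and the class sizes 1, 3, 2 (identity, transpositions, 3-cycles) are supplied by hand rather than taken from the text.

```python
from fractions import Fraction

# Sizes of the conjugacy classes of S3: identity, transpositions, 3-cycles
# (an assumption supplied by hand, not read from Table 1.2).
class_sizes = [1, 3, 2]
group_order = sum(class_sizes)  # 6

# Rows of Table 1.2; the values happen to be integers here.
table = [
    [1, 1, 1],    # trivial character
    [1, -1, 1],   # sign character
    [2, 0, -1],   # two-dimensional character
]

def pairing(ch1, ch2):
    """Hermitian form (1.1.8a), with the sum over G grouped by conjugacy class."""
    total = sum(size * a.conjugate() * b
                for size, a, b in zip(class_sizes, ch1, ch2))
    return Fraction(total, group_order)

def column_sum(j, k):
    """Left-hand side of (1.1.8c) for the classes labelling columns j and k."""
    return sum(row[j].conjugate() * row[k] for row in table)

# Matrix of pairings between the rows: should be the identity matrix.
row_gram = [[pairing(a, b) for b in table] for a in table]

# Sum of squared dimensions: the dimensions are the first column of the table.
sum_of_squares = sum(row[0] ** 2 for row in table)
```

Here `row_gram` is the 3 × 3 matrix of pairings between rows, `column_sum(j, j)` equals the group order divided by the size of class $j$, and `sum_of_squares` reproduces the order of the group.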


Fig. 1.2 The relation $\sigma_2\sigma_3\sigma_2 = \sigma_3\sigma_2\sigma_3$ in $B_4$.

The notion of vector space and representation can be defined over any field $K$. One
thing that makes representations over, for example, the finite field $K = \mathbb{Z}_p$ much more
difficult is that characters no longer distinguish inequivalent representations. For instance,
take $G = \{e\}$ and consider the representations

$$\rho(e) = (1) \quad\text{and}\quad \rho'(e) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

These are certainly different representations: their dimensions are different. But over
the field $\mathbb{Z}_2$, their characters $\mathrm{ch}$ and $\mathrm{ch}'$ are identical. Theorem 1.1.2 also breaks down
here. Unless otherwise stated, in this book we restrict to characteristic 0 (but see modular
Moonshine in Section 7.3.5).
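The failure of characters in characteristic 2 can be seen in two lines of arithmetic. The sketch below is only an illustration, assuming (as the example above appears to) that the second representation sends the identity to the 3 × 3 identity matrix; the helper name `trace_mod` is hypothetical.

```python
def trace_mod(matrix, p):
    """Trace of a square integer matrix, reduced modulo the prime p."""
    return sum(matrix[i][i] for i in range(len(matrix))) % p

rho = [[1]]                                      # one-dimensional representation of {e}
rho_prime = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]    # three-dimensional representation of {e}

# In characteristic 0 the traces are the dimensions, 1 and 3.
traces_char_0 = (sum(rho[i][i] for i in range(1)),
                 sum(rho_prime[i][i] for i in range(3)))

# Over Z_2 both traces reduce to 1, so the characters coincide.
traces_char_2 = (trace_mod(rho, 2), trace_mod(rho_prime, 2))
```

Over $\mathbb{Z}_2$ the character only remembers the dimension modulo 2, which is why it can no longer tell these two representations apart.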


1.1.4 Braided #1: the braid groups 


Fundamental to us are the braid groups, especially $B_3$. By an $n$-braid we mean $n$
non-intersecting strands as in Figure 1.1. We are interested here in how the strands interweave,
and not how they knot, and so we won't allow the strands to double-back on themselves.
We regard two $n$-braids as equivalent if they can be deformed continuously into each
other; we make this notion more precise in Section 1.2.3. The set of equivalence classes
of $n$-braids forms a group, called the braid group $B_n$, with multiplication given by vertical
concatenation, as in Figure 1.1.
Artin (1925) gives a very useful presentation of $B_n$:

$$B_n = \langle \sigma_1, \ldots, \sigma_{n-1} \mid \sigma_i\sigma_j = \sigma_j\sigma_i \text{ whenever } |i - j| \ge 2,\ \ \sigma_i\sigma_{i+1}\sigma_i = \sigma_{i+1}\sigma_i\sigma_{i+1} \rangle. \tag{1.1.9}$$

Here $\sigma_i$ denotes the braid obtained from the identity braid by interchanging the $i$th and
$(i+1)$th strands, with the $i$th strand on top. See Figure 1.2 for an illustration.
Of course $B_1$ is trivial and $B_2 \cong \mathbb{Z}$, but the other $B_n$ are quite interesting. Any
nontrivial element in $B_n$ has infinite order. Let $\sigma = \sigma_1\sigma_2\cdots\sigma_{n-1}$; then $\sigma\sigma_i = \sigma_{i+1}\sigma$, so the
generators are all conjugate and the braid $Z = \sigma^n$ lies in the centre of $B_n$. In fact, for
$n > 2$ the centre $Z(B_n) \cong \mathbb{Z}$, and is generated by that braid $Z$. We're most interested in
$B_3$: then $Z = (\sigma_1\sigma_2)^3$ generates the centre, and we will see shortly that

$$B_3/\langle Z^2\rangle \cong \mathrm{SL}_2(\mathbb{Z}), \tag{1.1.10a}$$
$$B_3/\langle Z\rangle \cong \mathrm{PSL}_2(\mathbb{Z}). \tag{1.1.10b}$$

There is a surjective homomorphism $\phi : B_n \to S_n$ taking a braid $\alpha$ to the permutation
$\phi(\alpha) \in S_n$, where the strand of $\alpha$ starting at position $i$ on the top ends on the bottom
at position $\phi(\alpha)(i)$. For example, $\phi(\sigma_i)$ is the transposition $(i, i+1)$. The kernel of $\phi$ is
called the pure braid group $P_n$. A presentation for $P_n$ is given in lemma 1.8.2 of [59].
We find that $P_2 = \langle\sigma_1^2\rangle \cong \mathbb{Z}$ and

$$P_3 = \langle \sigma_1^2, \sigma_2^2, Z\rangle \cong F_2 \times \mathbb{Z}. \tag{1.1.10c}$$

Another obvious homomorphism is the degree map $\deg : B_n \to \mathbb{Z}$, defined by
$\deg(\sigma_i^{\pm 1}) = \pm 1$. It is easy to show using (1.1.9) that 'deg' is well defined and is the
number of signed crossings in the braid. Its kernel is the commutator subgroup $[B_n, B_n]$
(see Question 1.1.7(a)).
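Both homomorphisms are easy to compute from a braid word. The sketch below is an illustration under an assumed encoding that is not from the text: a braid in $B_n$ is a list of nonzero integers, with $+i$ standing for $\sigma_i$ and $-i$ for $\sigma_i^{-1}$. The function names `braid_permutation` and `braid_degree` are hypothetical.

```python
def braid_permutation(word, n):
    """Permutation induced by a braid word on n strands.

    Returns a dict sending each starting position (1-based) of a strand
    to the position where that strand ends.
    """
    position = {s: s for s in range(1, n + 1)}
    for letter in word:
        i = abs(letter)
        # sigma_i and its inverse both exchange the strands at positions i and i+1.
        swap = {i: i + 1, i + 1: i}
        position = {s: swap.get(pos, pos) for s, pos in position.items()}
    return position

def braid_degree(word):
    """Degree map: the number of signed crossings in the braid word."""
    return sum(1 if letter > 0 else -1 for letter in word)

# The centre-generator Z = (sigma_1 sigma_2)^3 of B_3, in the assumed encoding.
Z = [1, 2] * 3
```

In this encoding `braid_permutation([1], 3)` is the transposition exchanging 1 and 2, the word `Z` induces the identity permutation (so it is a pure braid) and has degree 6, and any commutator has degree 0.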

The most important realisation of the braid group is as a fundamental group (see
(1.2.6)). It is directly through this that most appearances of $B_n$ in Moonshine-like
phenomena arise (e.g. Jones' braid group representations from subfactors, or Kohno's from
the monodromy of the Knizhnik–Zamolodchikov equation).

The relation of $B_3$ to modularity in Moonshine, however, seems more directly to
involve the faithful action of $B_n$ on the free group $F_n = \langle x_1, \ldots, x_n\rangle$ (see Question 6.3.5).
This action allows us to regard $B_n$ as a subgroup of $\mathrm{Aut}\,F_n$.

As is typical for infinite discrete groups, $B_n$ has continua of representations. For
instance, there is a different one-dimensional representation for every choice of nonzero
complex number $w$, namely $\alpha \mapsto w^{\deg\alpha}$. It seems reasonable to collect these together and regard
them as different specialisations of a single one-dimensional $\mathbb{C}[w^{\pm 1}]$-representation,
which we could call $w^{\deg}$, where $\mathbb{C}[w^{\pm 1}]$ is the (Laurent) polynomial algebra in $w$ and
$w^{-1}$.

The Burau representation (Burau, 1936) of $B_n$ is an $n$-dimensional representation with
entries in the Laurent polynomials $\mathbb{C}[w^{\pm 1}]$, and is generated by the matrices

$$\sigma_i \mapsto I_{i-1} \oplus \begin{pmatrix} 1-w & w \\ 1 & 0 \end{pmatrix} \oplus I_{n-i-1}, \tag{1.1.11a}$$

where $I_k$ here denotes the $k \times k$ identity matrix. $\mathbb{C}[w^{\pm 1}]$ isn't a field, but checking
determinants confirms that all the matrices in (1.1.11a) are invertible over it. The Burau
representation is reducible: in particular the column vector $v = (1, 1, \ldots, 1)^t$ is an eigenvector
with eigenvalue 1, for all the matrices in (1.1.11a), and hence $B_n$ acts trivially on the
subspace $\mathbb{C}v$. The remaining $(n-1)$-dimensional representation is the reduced Burau
representation. For example, for $B_3$ it is

$$\sigma_1 \mapsto \begin{pmatrix} -w & 1 \\ 0 & 1 \end{pmatrix}, \qquad \sigma_2 \mapsto \begin{pmatrix} 1 & 0 \\ w & -w \end{pmatrix}, \tag{1.1.11b}$$

and so the centre-generator $Z$ maps to the scalar matrix $w^3 I$. Note that the specialisation
$w = -1$ has image $\mathrm{SL}_2(\mathbb{Z})$ (in fact it gives the isomorphism (1.1.10a)) while $w = 1$
has image $S_3$ and is the representation $\rho_2$.
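The claims about the reduced Burau representation of $B_3$ can be spot-checked numerically. The sketch below is an illustration, not a proof: it assumes the two matrices have the form $\sigma_1 \mapsto \begin{pmatrix} -w & 1 \\ 0 & 1 \end{pmatrix}$, $\sigma_2 \mapsto \begin{pmatrix} 1 & 0 \\ w & -w \end{pmatrix}$, and it tests the braid relation and the image of $Z = (\sigma_1\sigma_2)^3$ at a few exact rational values of $w$ rather than symbolically. The function names are hypothetical.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def reduced_burau(w):
    """Images of sigma_1 and sigma_2 (the assumed form of (1.1.11b))."""
    s1 = [[-w, 1], [0, 1]]
    s2 = [[1, 0], [w, -w]]
    return s1, s2

def check(w):
    """True if the braid relation holds and (s1 s2)^3 equals w^3 times the identity."""
    s1, s2 = reduced_burau(w)
    braid_ok = matmul(matmul(s1, s2), s1) == matmul(matmul(s2, s1), s2)
    s12 = matmul(s1, s2)
    z = matmul(matmul(s12, s12), s12)
    centre_ok = z == [[w ** 3, 0], [0, w ** 3]]
    return braid_ok and centre_ok

# A few exact sample values of w; any nonzero value should pass.
samples = [Fraction(3, 7), Fraction(-5, 2), Fraction(1), Fraction(-1)]
all_pass = all(check(w) for w in samples)
```

At $w = -1$ the two matrices become $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}$, integer matrices of determinant 1, and $Z$ maps to $-I$, consistent with (1.1.10a).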

There are many natural ways to obtain the representation (1.1.11a). The simplest uses
derivatives $\frac{\partial}{\partial x_i}$ acting in the obvious way on the group algebra $\mathbb{C}F_n$. To any $n$-braid
$\alpha \in B_n$ define the $n \times n$ matrix whose $(i, j)$-entry is given by

$$w^{\deg}\Bigl(\frac{\partial}{\partial x_j}(\alpha.x_i)\Bigr),$$

where $\alpha.x_i$ denotes the action of $B_n$ on $F_n$ and where $w^{\deg}$ is the obvious representation
of $F_n$, extended linearly to $\mathbb{C}F_n$. Then this recovers (1.1.11a).

All irreducible representations of $B_3$ in dimension $\le 5$ are found in [531]. Most are
non-unitary. For example, any two-dimensional irreducible representation is of the form

$$\sigma_1 \mapsto \begin{pmatrix} \lambda_1 & \lambda_2 \\ 0 & \lambda_2 \end{pmatrix}, \qquad \sigma_2 \mapsto \begin{pmatrix} \lambda_2 & 0 \\ -\lambda_1 & \lambda_1 \end{pmatrix}$$

for some nonzero complex numbers $\lambda_1, \lambda_2$ (compare (1.1.11b)). This representation
will be unitary iff both $|\lambda_1| = |\lambda_2| = 1$ and $\lambda_1/\lambda_2 = e^{it}$ for $\pi/3 < t < 5\pi/3$. Not all
representations of $B_3$ are completely reducible, however (Question 1.1.9).


Question 1.1.1. Identify the group PSL2(Z2) and confirm that it isn’t simple. 


Question 1.1.2. If G and H are any two groups with ||G|| = || H || < 60, explain why 
they will have the same composition factors. 


Question 1.1.3. Verify that the dihedral group $D_n$ (1.1.1) has order $2n$. Find its
composition factors. Construct $D_n$ as a semi-direct product of $\mathbb{Z}_2$ and $\mathbb{Z}_n$.


Question 1.1.4. (a) Using the methods and results given in Section 1.1.3, compute the 
character table of the symmetric group S4. 

(b) Compute the tensor product coefficients of $S_4$. That is, if $\rho_1, \rho_2, \ldots$ are the irreducible
representations of $S_4$, compute the multiplicities $T_{ij}^k$ defined by

$$\rho_i \otimes \rho_j \cong \bigoplus_k T_{ij}^k\,\rho_k.$$

Question 1.1.5. Prove that $\mathrm{ch}(g^{-1}) = \overline{\mathrm{ch}(g)}$. Can you say anything about the relation of
$\mathrm{ch}(g^\ell)$ and $\mathrm{ch}(g)$, for other integers $\ell$?


Question 1.1.6. (a) Find a representation over the field C of the additive group G = Z, 
which is indecomposable but not irreducible. Hence show that inequivalent (complex) 
finite-dimensional representations of Z can have identical characters. 

(b) Let $p$ be any prime dividing some $n \in \mathbb{N}$. Find a representation of the cyclic group
$G = \mathbb{Z}_n$ over the field $K = \mathbb{Z}_p$, which is indecomposable but not irreducible.

Question 1.1.7. (a) Let $G$ be any group, and define the commutator subgroup $[G, G]$
to be the subgroup generated by the elements $ghg^{-1}h^{-1}$, for all $g, h \in G$. Prove that
$[G, G]$ is a normal subgroup of $G$, and that $G/[G, G]$ is abelian. (In fact, $G/[G, G]$ is
isomorphic to the group of all one-dimensional representations of $G$.)

(b) Show that the free groups $F_n \cong F_m$ iff $n = m$, by using Theorem 1.1.2.


Question 1.1.8. (a) Explicitly show how the semi-direct product $\mathbb{Z}_3 \times_\theta \mathbb{Z}_2$ can equal $\mathbb{Z}_6$
or $S_3$, depending on the choice of $\theta$.

(b) Show that $\mathbb{Z}_2 \times_\theta H \cong \mathbb{Z}_2 \times H$, for any group $H$ and homomorphism $\theta$.

(c) Hence $\mathbb{Z}_4$ can't be written as a semi-direct product of $\mathbb{Z}_2$ with $\mathbb{Z}_2$. Explicitly construct
it as an external group extension of $\mathbb{Z}_2$ by $\mathbb{Z}_2$.


Question 1.1.9. Find a two-dimensional representation of the braid group $B_3$ that is not
completely reducible.


1.2 Elementary geometry 


Geometry and algebra are opposites. We inherited from our mammalian ancestors our 
subconscious facility with geometry; to us geometry is intuitive and has implicit meaning, 
but because of this it’s harder to generalise beyond straightforward extensions of our 
visual experience, and rigour tends to be more elusive than with algebra. The power and 
clarity of algebra comes from the conceptual simplifications that arise when content is 
stripped away. But this is equally responsible for algebra’s blindness. Although recently 
physics has inspired some spectacular developments in algebra, traditionally geometry 
has been the most reliable star algebraists have been guided by. We touch on geometry 
throughout this book, though for us it adds more colour than essential substance. 


1.2.1 Lattices 


Many words in mathematics have multiple meanings. For example, there are vector fields 
and number fields, and modular forms and modular representations. ‘Lattice’ is another 
of these words: it can mean a ‘partially ordered set’, but to us a lattice is a discrete 
maximally periodic set — a toy model for everything that follows. 

Consider the real vector space $\mathbb{R}^{m,n}$: its vectors look like $x = (x_+; x_-)$, where $x_+$ and
$x_-$ are $m$- and $n$-component vectors, respectively, and inner-products are given by
$x \cdot y = x_+ \cdot y_+ - x_- \cdot y_-$. The inner-products $x_\pm \cdot y_\pm$ are given by the usual $\sum_i (x_\pm)_i (y_\pm)_i$.
For example, the familiar Euclidean (positive-definite) space is $\mathbb{R}^n = \mathbb{R}^{n,0}$, while the
Minkowski space-time of special relativity is $\mathbb{R}^{3,1}$.

Now choose any basis $\beta = \{b^{(1)}, \ldots, b^{(m+n)}\}$ in $\mathbb{R}^{m,n}$. If we consider all possible linear
combinations $\sum_i a_i b^{(i)}$ over the real numbers $\mathbb{R}$, then we recover $\mathbb{R}^{m,n}$; if instead we
consider linear combinations over the integers only, we get a lattice.


Definition 1.2.1 Let $V$ be any $n$-dimensional inner-product space, and let $\{b^{(1)},
\ldots, b^{(n)}\}$ be any basis. Then $L(\beta) := \mathbb{Z}b^{(1)} + \cdots + \mathbb{Z}b^{(n)}$ is called a lattice.
Fig. 1.3 Part of the $A_2$ disc packing.


A lattice is discrete and closed under sums and integer multiples. For example, $\mathbb{Z}^{m,n}$ is
a lattice (take the standard basis in $\mathbb{R}^{m,n}$). A more interesting lattice is the hexagonal
lattice (also called $A_2$), given by the basis $\beta = \{(\tfrac{\sqrt{2}}{2}, \tfrac{\sqrt{6}}{2}), (\sqrt{2}, 0)\}$ of $\mathbb{R}^2$; try to plot
several points. If you wanted to slide a bunch of identical coins on a table together as
tightly as possible, their centres would form the hexagonal lattice (Figure 1.3). Another
important lattice is $II_{1,1} \subset \mathbb{R}^{1,1}$, given by $\beta = \{(\tfrac{1}{\sqrt{2}}; \tfrac{1}{\sqrt{2}}), (\tfrac{1}{\sqrt{2}}; -\tfrac{1}{\sqrt{2}})\}$; equivalently, it
can be thought of as the set of all pairs $(a, b) \in \mathbb{Z}^2$ with inner-product

$$(a, b) \cdot (c, d) = ad + bc. \tag{1.2.1}$$


Different bases may or may not result in a different lattice. For a trivial example,
consider $\beta = \{1\}$ and $\beta' = \{-1\}$ in $\mathbb{R} = \mathbb{R}^{1,0}$: they both give the lattice $\mathbb{Z} = \mathbb{Z}^{1,0}$. Two
lattices $L(\beta) \subset V$ and $L(\beta') \subset V'$ are called equivalent or isomorphic if there is an
orthogonal transformation $T : V \to V'$ such that the lattices $T(L(\beta))$ and $L(\beta')$ are
identical as sets, or equivalently if $b'^{(i)} = T\sum_j c_{ij} b^{(j)}$, for some integer matrix $C = (c_{ij}) \in
\mathrm{GL}_n(\mathbb{Z})$ with determinant $\pm 1$.

This notion of lattice equivalence is important in that it emphasises the essential
properties of a lattice and washes away the unpleasant basis-dependence of Definition
1.2.1. In particular, the ambient space $V$ in which the lattice lives, and the basis $\beta$, are
non-essential. The transformation $T$ tells us we can change $V$, and $C$ is a change-of-basis
matrix for which both $C$ and $C^{-1}$ are defined over $\mathbb{Z}$.

For example, 6B = (eA a)» (E 5) in R? yields a lattice equivalent to Z*. The 
basis $\beta' = \{(-1, 1, 0), (0, -1, 1)\}$ for the plane $a + b + c = 0$ in $\mathbb{R}^3$ yields the lattice
$L(\beta') = \{(a, b, c) \in \mathbb{Z}^3 \mid a + b + c = 0\}$, equivalent to the hexagonal lattice $A_2$.

The dimension of the lattice is the dimension $\dim(V)$ of the ambient vector space. The
lattice is called positive-definite if it lies in some $\mathbb{R}^m$ (i.e. $n = 0$), and integral if all
inner-products $x \cdot y$ are integers, for $x, y \in L$. A lattice $L$ is called even if it is integral and in
addition all norm-squareds $x \cdot x$ are even integers. For example, $\mathbb{Z}^{m,n}$ is integral but not
even, while $A_2$ and $II_{1,1}$ are even. The dual $L^*$ of a lattice $L$ consists of all vectors $x \in V$
such that $x \cdot L \subset \mathbb{Z}$. A natural basis for the dual $L(\beta)^*$ is the dual basis $\beta^*$, consisting
of the vectors $c^{(j)} \in V$ obeying $b^{(i)} \cdot c^{(j)} = \delta_{ij}$ for all $i, j$. A lattice is integral iff $L \subset L^*$. A
lattice is called self-dual if $L = L^*$. The lattices $\mathbb{Z}^{m,n}$ and $II_{1,1}$ are self-dual, but $A_2$ is
not. We are most interested in even positive-definite lattices.

Fig. 1.4 The graphs with largest eigenvalue < 2.

To any $n$-dimensional lattice $L(\beta)$, define an $n \times n$ matrix $A$ (called a Gram matrix)
by $A_{ij} = b^{(i)} \cdot b^{(j)}$. Two lattices with identical Gram matrices are necessarily equivalent,
but the converse is not true. Note that the Gram matrix of $L(\beta^*)$ is the inverse of the Gram
matrix for $L(\beta)$. The determinant $|L|$ of a lattice is the determinant of the Gram matrix;
geometrically, it is the volume-squared of the fundamental parallelepiped of $L$ defined by
the basis. This will always be positive if $L$ is positive-definite. The determinant of a lattice
is independent of the specific basis $\beta$ chosen; equivalent lattices have equal determinant,
though the converse isn't true. An integral lattice $L$ is self-dual iff $|L| = \pm 1$. If $L' \subset L$
are of equal dimension, then the quotient $L/L'$ is a finite abelian group of order

$$\|L/L'\| = \sqrt{|L'|/|L|} \in \mathbb{N}. \tag{1.2.2}$$
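These definitions are easy to test on the two-dimensional examples above. The Python sketch below is an illustration only, restricted to rank 2 and working directly with Gram matrices. The Gram matrix used for $A_2$ assumes a basis of two norm-squared 2 vectors with inner-product 1, the one for $II_{1,1}$ comes from (1.2.1), and all function names are hypothetical.

```python
from fractions import Fraction
from math import isqrt

def det2(gram):
    """Determinant of a 2x2 Gram matrix, i.e. the lattice determinant |L|."""
    (a, b), (c, d) = gram
    return a * d - b * c

def is_even(gram):
    """For an integer Gram matrix, the lattice is even iff the diagonal is even."""
    return all(gram[i][i] % 2 == 0 for i in range(2))

def dual_gram(gram):
    """Gram matrix of the dual basis: the inverse of the Gram matrix."""
    (a, b), (c, d) = gram
    det = Fraction(det2(gram))
    return [[d / det, -b / det], [-c / det, a / det]]

def sublattice_index(gram_sub, gram_full):
    """Order of L/L' from (1.2.2): the square root of |L'| / |L|."""
    ratio = Fraction(det2(gram_sub), det2(gram_full))
    root = isqrt(int(ratio))
    if ratio.denominator != 1 or root * root != ratio:
        raise ValueError("determinant ratio is not a perfect square")
    return root

gram_A2 = [[2, 1], [1, 2]]     # hexagonal lattice (assumed basis, see lead-in)
gram_II11 = [[0, 1], [1, 0]]   # from (a, b).(c, d) = ad + bc
gram_Z2 = [[1, 0], [0, 1]]     # standard basis of Z^2
gram_2Z2 = [[4, 0], [0, 4]]    # the sublattice 2Z^2 of Z^2
```

With these inputs, $A_2$ is even with determinant 3 (so not self-dual), $II_{1,1}$ is even with determinant $-1$ (so self-dual), $\mathbb{Z}^2$ is self-dual but not even, and the sublattice $2\mathbb{Z}^2$ has index 4 in $\mathbb{Z}^2$.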


Given two lattices $L$, $L'$, their (orthogonal) direct sum $L \oplus L'$ is defined to consist
of all pairs $(x, x')$, for $x \in L$, $x' \in L'$, with inner-product defined by $(x, x') \cdot (y, y') =
x \cdot y + x' \cdot y'$. The dimension of $L \oplus L'$ is the sum of the dimensions of $L$ and $L'$. The
direct sum $L \oplus L'$ will be integral (respectively self-dual) iff both $L$ and $L'$ are integral
(respectively self-dual).

An important class of lattices are the so-called root lattices $A_n$, $D_n$, $E_6$, $E_7$, $E_8$
associated with simple Lie algebras (Section 1.5.2). They can be defined from the graph
('Coxeter-Dynkin diagram') in Figure 1.4 (but ignore the 'tadpole' $T_n$ for now): label
the nodes of such a graph from 1 to $n$, put $A_{ii} = 2$ and put $A_{ij} = -1$ if nodes $i$ and $j$
are connected by an edge. Then this matrix $A$ (the Cartan matrix of Definition 1.4.5)
is the Gram matrix of a positive-definite integral lattice. Realisations of some of these
are given shortly; bases can be found in table VII of [214], or planches I-VII of [84].
Of these, $E_8$ is the most interesting as it is the even self-dual positive-definite lattice of
smallest dimension.
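The recipe for building a Gram matrix from a graph can be carried out mechanically. The sketch below is an illustration: it builds the matrix with 2 on the diagonal, $-1$ for each edge and 0 elsewhere, and computes its determinant exactly. The node labellings chosen for $A_n$, $D_4$ and $E_8$ are assumptions made here (any labelling of the same graph gives the same determinant), and the function names are hypothetical.

```python
from fractions import Fraction

def gram_from_graph(n, edges):
    """Matrix with 2 on the diagonal and -1 for each edge (nodes labelled 1..n)."""
    gram = [[2 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        gram[i - 1][j - 1] = -1
        gram[j - 1][i - 1] = -1
    return gram

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def chain(n):
    """Edges of the A_n graph: a path on nodes 1..n."""
    return [(i, i + 1) for i in range(1, n)]

a4_edges = chain(4)
d4_edges = [(1, 2), (2, 3), (2, 4)]   # three legs of length 1 meeting at node 2
e8_edges = chain(7) + [(3, 8)]        # path of 7 nodes, extra node attached to node 3

det_a4 = determinant(gram_from_graph(4, a4_edges))
det_d4 = determinant(gram_from_graph(4, d4_edges))
det_e8 = determinant(gram_from_graph(8, e8_edges))
```

The determinants come out as 5 for $A_4$, 4 for $D_4$ and 1 for $E_8$. Since the diagonal entries are all 2, the $E_8$ matrix describes an even lattice of determinant 1, in line with the statement that $E_8$ is even and self-dual.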


The following theorem characterises norm-squared 1, 2 vectors.


Theorem 1.2.2 Let $L$ be an $n$-dimensional positive-definite integral lattice.

(a) Then $L$ is equivalent to the direct sum $\mathbb{Z}^m \oplus L'$, where $L$ has precisely $2m$ unit
vectors and $L'$ has none.

(b) If $L$ is spanned by its norm-squared 2 vectors, then $L$ is a direct sum of root lattices.


Theorem 1.2.2(b) gives the point-of-contact of Lie theory and lattices. The densest
packing of circles in the plane (Figure 1.3) is $A_2$, in the sense that the centres of these
circles are the points of $A_2$. The obvious pyramidal way to pack oranges is also the
densest, and likewise gives the $A_3$ root lattice. The densest known sphere packings in
dimensions 4, 5, 6, 7, 8 are the root lattices $D_4$, $D_5$, $E_6$, $E_7$, $E_8$, respectively.

The Leech lattice $\Lambda$ is one of the most distinguished lattices, and like $E_8$
is directly related to Moonshine. It can be constructed using 'laminated lattices'
([113], chapter 6). Start with the zero-dimensional lattice $L_0 = \{0\}$, consisting of just
one point. Use it to construct a one-dimensional lattice $L_1$, with minimal (nonzero) norm
2, built out of infinitely many copies of $L_0$ laid side by side. The result of course is simply
the even integers $2\mathbb{Z}$. Now construct a two-dimensional lattice $L_2$, of minimal norm 2,
built out of infinitely many copies of $L_1$ stacked next to each other. There are lots of
ways to do this, but choose the densest lattice possible. The result is the hexagonal lattice
$A_2$ rescaled by a factor of $\sqrt{2}$. Continue in this way: $L_3$, $L_4$, $L_5$, $L_6$, $L_7$ and $L_8$ are the
root lattices $A_3$, $D_4$, $D_5$, $E_6$, $E_7$ and $E_8$, respectively, all rescaled by $\sqrt{2}$.

The 24th repetition of this construction yields uniquely the Leech lattice $\Lambda = L_{24}$. It
is the unique 24-dimensional even self-dual lattice with no norm-squared 2-vectors, and
provides among other things the densest known packing of 23-dimensional spheres $S^{23}$
in $\mathbb{R}^{24}$. It is studied throughout [113]. After dimension 24, chaos reigns in lamination (23
different 25-dimensional lattices have an equal right to be called $L_{25}$, and over 75 000
are expected for $L_{26}$). So lamination provides us with a sort of no-input construction of
the Leech lattice. Like the Mandelbrot set, the Leech lattice is a subtle structure with an
elegant construction: a good example of the mathematical meaning of 'natural'.

Question 1.2.1 asks you to come up with a definition for the automorphism group of 
a lattice. An automorphism is a symmetry, mapping the lattice to itself, preserving all 
essential lattice properties. It is how group theory impinges on lattice theory. 

Most (positive-definite) lattices have trivial automorphism groups, consisting only
of the identity and the reflection $x \mapsto -x$ through the origin. But the more interesting
lattices tend to have quite large groups. The reflection through the hyperplane orthogonal
to a norm-squared 2-vector in an integral lattice defines an automorphism; together, these
automorphisms form what Lie theory calls a Weyl group.

Typically the Weyl group has small index in the full automorphism group, though a
famous counterexample is the Leech lattice (which, as we know, has trivial Weyl group).
Its automorphism group is denoted $Co_0$ and has approximately $8 \times 10^{18}$ elements. The
automorphism $x \mapsto -x$ lies in its centre; if we quotient by this 2-element centre we get
a sporadic simple group $Co_1$. Define $Co_2$ and $Co_3$ to be the subgroups of $Co_0$ consisting
of all $g \in Co_0$ fixing some norm-squared 4-vector and some norm-squared 6-vector,
respectively. These three groups $Co_1$, $Co_2$, $Co_3$ are all simple. In fact, a total of 12
sporadic finite simple groups appear as subquotients in $Co_0$, and can best be studied
geometrically in this context. Gorenstein [256] wrote:


... if Conway had studied the Leech lattice some 5 years earlier, he would have
discovered a total of 7 new simple groups! Unfortunately he had to settle for 3.
However, as consolation, his paper on .0 [= $Co_0$] will stand as one of the most
elegant achievements of mathematics.


1.2.2 Manifolds 


On what structures do lattices act naturally? An obvious place is on their ambient space
($\mathbb{R}^n$, say). They act by addition. Quotient out by this action. Topologically, we have
created a manifold (to be defined shortly); to each point on this manifold corresponds an
orbit in $\mathbb{R}^n$ of our lattice action.

Fig. 1.5 A coordinate patch.

Consider first the simplest case. The number $n \in \mathbb{Z}$ acts on $\mathbb{R}$ by sending $x \in \mathbb{R}$
to $x + n$. The orbits are the equivalence classes of the reals mod 1. We can take as
representatives of these equivalence classes, that is as points of $\mathbb{R}/\mathbb{Z}$, the half-open
interval $[0, 1)$. This orbit space inherits a topology (i.e. a qualitative notion of points
being close; for basic point-set topology see e.g. [481], [104]), from that of $\mathbb{R}$, and
this is almost captured by the interval $[0, 1)$. The only problem is that the orbit of
$0.9999 \equiv -0.0001$ is pretty close to that of 0, even though they are at opposite ends of
the interval. What we should do is identify the two ends, i.e. glue together 0 and 1. The
result is a circle.
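The point about gluing the ends can be made concrete with a distance function on $\mathbb{R}/\mathbb{Z}$. The sketch below is an illustration; the name `circle_distance` is hypothetical, and it measures distance between orbits as the shorter way around the circle.

```python
def circle_distance(x, y):
    """Distance between the orbits of x and y in R/Z (a circle of circumference 1)."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

# Distance measured naively inside the interval [0, 1).
interval_gap = abs(0.9999 - 0.0)

# Distance measured on the circle, where 0 and 1 are glued together.
circle_gap = circle_distance(0.9999, 0.0)
```

Inside the interval the two points look almost a full unit apart, while on the circle their distance is about 0.0001. Shifting either argument by an integer leaves `circle_distance` unchanged up to rounding, as it should for a function on orbits.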

We say that $\mathbb{R}/\mathbb{Z}$ is topologically the circle $S^1$. The same argument applies to $\mathbb{R}^n/L$,
and we get the $n$-torus $S^1 \times \cdots \times S^1$ (see Question 1.2.2).

The central structure in geometry is a manifold — geometries where calculus is possible. 
Locally, a manifold looks like a piece of R” (or C”), but these pieces can be bent and 
stitched together to create more interesting shapes. For instance, the n-torus is an n- 
dimensional manifold. The definition of manifold, due to Poincaré at the turn of the 
century, is a mathematical gem; it explains how flat patches can be sewn together to 
form smooth and globally interesting shapes. 


Definition 1.2.3 A $C^\infty$ manifold $M$ is a topological space with a choice of open sets
$U_\alpha \subset M$, $V_\alpha \subset \mathbb{R}^n$ and homeomorphisms $\varphi_\alpha : U_\alpha \to V_\alpha$, as in Figure 1.5, such that
the $U_\alpha$ cover $M$ (i.e. $M = \cup_\alpha U_\alpha$) and, whenever $U_\alpha \cap U_\beta$ is non-empty, the map $\varphi_\alpha \circ \varphi_\beta^{-1}$ is a $C^\infty$
map from some open subset (namely $\varphi_\beta(U_\alpha \cap U_\beta)$) of $V_\beta$ to some open subset (namely
$\varphi_\alpha(U_\alpha \cap U_\beta)$) of $V_\alpha$.


A homeomorphism means an invertible continuous map whose inverse is also
continuous. By a $C^\infty$ map $f$ between open subsets of $\mathbb{R}^n$, we mean that

$$f(x) = (f_1(x_1, \ldots, x_n), \ldots, f_n(x_1, \ldots, x_n))$$

is continuous, and all partial derivatives $\frac{\partial}{\partial x_{i_1}} \cdots \frac{\partial}{\partial x_{i_k}} f_j$ exist and are also continuous.

This is the definition of a real manifold; a complex manifold is similar. An
$n$-dimensional complex manifold is a $2n$-dimensional real one. A one-dimensional
manifold is called a curve, and a two-dimensional one a surface. 'Smooth' is often used for
$C^\infty$.

Using $\varphi_\alpha$, each 'patch' $U_\alpha \subset M$ inherits the structure of $V_\alpha \subset \mathbb{R}^n$. For instance, we
can coordinatise $V_\alpha$ and do calculus on it, and hence we get coordinates for, and can
do calculus on, $U_\alpha$. The overlap condition for $\varphi_\alpha \circ \varphi_\beta^{-1}$ guarantees compatibility. For
example, the familiar latitude/longitude coordinate system comes from covering the
Earth with two coordinate patches $V_i$ (one centred on the North pole and the other on
the South, and both stretching to the Equator) with polar coordinates chosen on each $V_i$.

More (or less) structure can be placed on the manifold, by constraining the overlap
functions $\varphi_\alpha \circ \varphi_\beta^{-1}$ more or less. For example, a 'topological manifold' drops the $C^\infty$
constraint; the result is that we can no longer do calculus on the manifold, but we can
still speak of continuous functions, etc. A conformal manifold requires that the overlap
functions preserve angles in $\mathbb{R}^n$; the angle between intersecting curves in $\mathbb{R}^n$ is defined
to be the angle between the tangents to the curves at the point of intersection. Conformal
manifolds inherit the notion of angle from $\mathbb{R}^n$. Stronger is the notion of Riemannian
manifold, which also enables us to speak of length.

It is now easy to compare structures on different manifolds. For instance, given two
manifolds $M$, $M'$, a function $f : M \to M'$ is '$C^\infty$' if each composition $\varphi'_\beta \circ f \circ \varphi_\alpha^{-1}$ is
a $C^\infty$ map from some open subset of $V_\alpha$ to $V'_\beta$. $M$ and $M'$ are $C^\infty$-diffeomorphic if there
is an invertible $C^\infty$-function $f : M \to M'$ whose inverse is defined and is also $C^\infty$.

Note that our definition doesn't assume the manifold $M$ is embedded in some ambient
space $\mathbb{R}^m$. Although it is true (Whitney) that any $n$-dimensional real manifold $M$ can be
embedded in Euclidean space $\mathbb{R}^{2n}$, this embedding may not be natural. For example, we
are told that we live in a 'curved' four-dimensional manifold called space-time, but its
embedding in $\mathbb{R}^8$ presumably has no physical significance.

Much effort in differential geometry has been devoted to questions such as: Given some 
topological manifold M, how many inequivalent differential structures (compatible with 
the topological structure) can be placed on $M$? It turns out that for any topological
manifold of dimension $\le 3$, this differential structure exists and is unique. Moreover,
$\mathbb{R}^n$ has a unique differential structure as well in dimensions $\ge 5$. Remarkably, in four
(and only four) dimensions it has uncountably many different differential structures (see
[195])! Could this have anything to do with the appearance of macroscopic space-time
being $\mathbb{R}^4$? Half a century before that discovery, the physicist Dirac prophesied [139]:


...as time goes on it becomes increasingly evident that the rules which the math- 
ematician finds interesting are the same as those which Nature has chosen ... only 
four-dimensional space is of importance in physics, while spaces with other dimen- 
sions are of about equal interest in mathematics. It may well be, however, that this 
discrepancy is due to the incompleteness of present-day knowledge, and that future 
developments will show four-dimensional space to be of far greater mathematical 
interest than all the others. 


Given any open set $U$ in a manifold $M$, write $C^\infty(U)$ for the space of $C^\infty$-functions
$f : U \to \mathbb{R}$. When $U \subset U_\alpha$, we can use local coordinates and write $f(x^1, \ldots, x^n)$ (local
coordinates are often written with superscripts). A fundamental lesson of geometry
(perhaps learned from physics) is that one studies the manifold $M$ through the (local) smooth
functions $f \in C^\infty(U)$ that live on it. This approach to geometry has been axiomatised
into the notion of sheaf (see e.g. [537]), to which we return in Section 5.4.2.

For example, identifying $S^1$ with $\mathbb{R}/\mathbb{Z}$, the space $C^\infty(S^1)$ consists of the smooth
period-1 functions $f : \mathbb{R} \to \mathbb{R}$, i.e. $f(\theta + 1) = f(\theta)$. Or we can identify $S^1$ with the
locus $x^2 + y^2 = 1$, in which case $C^\infty(S^1)$ can be identified with the algebra $C^\infty(\mathbb{R}^2)$ of
smooth functions in two variables, quotiented by the subalgebra (in fact ideal) consisting
of all smooth functions $g(x, y)$ vanishing on all points satisfying $x^2 + y^2 = 1$; when
$f(x, y)$, $g(x, y)$ are polynomials, then they are identical functions in $C^\infty(S^1)$ iff their
difference $f(x, y) - g(x, y)$ is a polynomial multiple of $x^2 + y^2 - 1$.

Fig. 1.6 The tangent bundle of $S^1$.

Fix a point $p \in M$ and an open set $U$ containing $p$. In Section 1.4.2 we need the notion
of tangent vectors to a manifold $M$. An intuitive approach starts from the set $S(U, p)$ of
curves passing through $p$, i.e. $\sigma : (-\epsilon, \epsilon) \to U$ is smooth and $\sigma(0) = p$. Call curves
$\sigma_1, \sigma_2 \in S(U, p)$ equivalent if they touch each other at $p$, that is

$$\sigma_1 \approx_p \sigma_2 \quad \text{iff} \quad \frac{d}{dt} f(\sigma_1(t))\Big|_{t=0} = \frac{d}{dt} f(\sigma_2(t))\Big|_{t=0}, \quad \forall f \in C^\infty(U). \tag{1.2.3a}$$

This defines an equivalence relation; the equivalence class $\langle\sigma\rangle_p$, consisting of all curves
equivalent to $\sigma$, is an infinitesimal curve at $p$. Equivalently, define a tangent vector to be
a linear map $\xi : C^\infty(M) \to \mathbb{R}$ that satisfies the Leibniz rule

$$\xi(fg) = \xi(f)\, g(p) + f(p)\, \xi(g). \tag{1.2.3b}$$

In local coordinates $\xi = \sum_{i=1}^n \alpha_i \frac{\partial}{\partial x^i}\big|_{x=p}$, where the $\alpha_i$ are arbitrary real numbers. The
bijection between these two definitions associates with any infinitesimal curve $v = \langle\sigma\rangle_p$
the tangent vector called the directional derivative $D_v : C^\infty(M) \to \mathbb{R}$, given by

$$D_v(f) = \frac{d}{dt} f(\sigma(t))\Big|_{t=0}. \tag{1.2.3c}$$
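The two descriptions of a tangent vector can be checked numerically. The following is only a sketch under stated assumptions: we take $M$ to be the plane, $p = (1, 0)$, and the curves `sigma1`, `sigma2` and functions `f`, `g` are hypothetical choices made for illustration. It estimates (1.2.3c) by a central difference, confirms that two curves touching at $p$ give the same directional derivative as in (1.2.3a), and tests the Leibniz rule (1.2.3b).

```python
import math

P = (1.0, 0.0)  # the base point p (an assumption made for this illustration)

def sigma1(t):
    """A curve through p along the unit circle: sigma1(0) = p."""
    return (math.cos(t), math.sin(t))

def sigma2(t):
    """A straight line through p that touches sigma1 at p: sigma2(0) = p."""
    return (1.0, t)

def D(sigma, f, h=1e-5):
    """Central-difference estimate of d/dt f(sigma(t)) at t = 0, as in (1.2.3c)."""
    return (f(*sigma(h)) - f(*sigma(-h))) / (2 * h)

def f(x, y):
    return x * x + 3 * y

def g(x, y):
    return math.exp(x) * (1 + y)

def fg(x, y):
    return f(x, y) * g(x, y)

# Equivalent curves give the same directional derivative, as in (1.2.3a).
same_on_f = abs(D(sigma1, f) - D(sigma2, f))
same_on_g = abs(D(sigma1, g) - D(sigma2, g))

# Leibniz rule (1.2.3b): D(fg) = D(f) g(p) + f(p) D(g).
leibniz_gap = abs(D(sigma1, fg) - (D(sigma1, f) * g(*P) + f(*P) * D(sigma1, g)))
```

All three gaps are zero up to the error of the finite-difference approximation.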

The tangent space $T_p(M)$ at $p$ is the set of all tangent vectors. Equation (1.2.3b)
shows that $T_p(M)$ has a natural vector space structure; its dimension equals that of $M$.
These tangent spaces can be glued together into a $2n$-dimensional manifold called the
tangent bundle $TM$. Figure 1.6 shows why $TS^1$ is the cylinder $S^1 \times \mathbb{R}$. However, this
is exceptional: although locally the tangent bundle $TM$ of any manifold is trivial (that
is, each $TU_\alpha$ is diffeomorphic to the direct product $U_\alpha \times \mathbb{R}^n$), globally most tangent
bundles $TM$ are different from $M \times \mathbb{R}^n$.

A vector field is an assignment of a tangent vector to each point on the manifold, that is,
a smooth map $X : M \to TM$ such that $X(p) \in T_p(M)$. Equivalently, we can regard it
as a derivation $X : C^\infty(M) \to C^\infty(M)$, i.e. a first-order differential operator acting on
functions $f : M \to \mathbb{R}$ and obeying $X(fg) = X(f)\, g + f\, X(g)$. For example, the vector
fields on the circle consist of the operators $g(\theta) \frac{d}{d\theta}$ for any smooth period-1 function $g(\theta)$.

Fig. 1.7 The flow of a vector field.

Let $\mathrm{Vect}(M)$ denote the set of all vector fields on a manifold $M$. Of course this is
an infinite-dimensional vector space, but we see in Section 1.4.1 that it has a much
richer algebraic structure: it is a Lie algebra. $\mathrm{Vect}(S^1)$ is central to Moonshine, and in
Section 3.1.2 we start exploring its properties.

A vector field $X$ on $M$ can be interpreted as being the instantaneous velocity of a
fluid confined to $M$. We can ‘integrate’ this, by solving a first-order ordinary differential
equation, thus covering $M$ with a family of non-intersecting curves. Each curve describes
the motion, or flow, of a small particle dropped into the fluid at the given point $p \in M$.
The tangent vector to the curve at the given point $p$ equals $X(p)$; see Figure 1.7.
Equivalently, corresponding to a vector field $X$ is a continuous family $\varphi_t : M \to M$ of
diffeomorphisms of $M$, one for each ‘time’ $t$, obeying $\varphi_t \circ \varphi_s = \varphi_{t+s}$, where $\varphi_t(p)$ is
defined to be the position on $M$ where the point $p$ flows to after $t$ seconds.
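The group law $\varphi_t \circ \varphi_s = \varphi_{t+s}$ can be observed numerically. The sketch below is an illustration under assumptions of our own: it takes $M$ to be the plane with the hypothetical vector field $X(x, y) = (-y, x)$, which is tangent to the unit circle, and it integrates the ordinary differential equation with the classical fourth-order Runge–Kutta scheme (any reasonable integrator would do).

```python
import math

def X(p):
    """Hypothetical vector field on the plane: X(x, y) = (-y, x)."""
    x, y = p
    return (-y, x)

def flow(p, t, steps=2000):
    """Approximate phi_t(p) by integrating dp/dt = X(p) with classical RK4."""
    h = t / steps
    x, y = p
    for _ in range(steps):
        k1 = X((x, y))
        k2 = X((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = X((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = X((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

p = (1.0, 0.0)
s, t = 0.4, 0.9
lhs = flow(flow(p, s), t)                    # phi_t(phi_s(p))
rhs = flow(p, s + t)                         # phi_{t+s}(p)
exact = (math.cos(s + t), math.sin(s + t))   # for this field the flow is rotation by the angle t
```

For this particular field the flow lines are circles about the origin, so on the unit circle we recover the rotations generated by $\frac{d}{d\theta}$.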

So it is natural to ask, what can we do with a diffeomorphism $\alpha$ of $M$? Clearly, $\alpha$ gives
rise to an automorphism of the algebra $C^\infty(M)$, defined by $f \mapsto f^\alpha = f \circ \alpha$. Using
this, we get an automorphism of $\mathrm{Vect}(M)$, $X \mapsto X^\alpha$, given by $X^\alpha(f) = (X(f^\alpha))^{\alpha^{-1}}$, or
more explicitly, $X^\alpha(f)(p) = X(f \circ \alpha)(\alpha^{-1}(p))$. We return to this in Section 1.4.2.

One thing you can do with a continuous family of diffeomorphisms is construct a
derivative for the algebras $C^\infty(M)$, $\mathrm{Vect}(M)$, etc. Defining a derivative of, say, a vector
field $X$ requires that we compare tangent vectors $X(p), X(p')$ at neighbouring points
on the manifold. This can’t be done directly, since $X(p) \in T_pM$ and $X(p') \in T_{p'}M$ lie
in different spaces. Given a vector field $X$, and corresponding flow $\varphi_t$, define the Lie
derivative $\mathcal{L}_X(Y) \in \mathrm{Vect}(M)$ of any vector field $Y \in \mathrm{Vect}(M)$ by

$$\mathcal{L}_X(Y)(p) = \lim_{t \to 0} \frac{Y^{\varphi_t} - Y}{t}(p) \in T_pM.$$

The Lie derivative $\mathcal{L}_X(f)$ of a function $f \in C^\infty(M)$ is defined similarly, and equals
$X(f)$.

Dual to the tangent vectors are the differential 1-forms. Just as the tangent spaces $T_p(M)$
together form the $2n$-dimensional tangent bundle $TM$, so their duals $T_p^*(M)$ form the
$2n$-dimensional cotangent bundle $T^*M$. At least for finite-dimensional manifolds, the vector
spaces $T_p^*(M)$ and $T_p(M)$, as well as the manifolds $T^*M$ and $TM$, are homeomorphic,
but without additional structure on $M$ this homeomorphism is not canonical (it is basis-
or coordinate-dependent). If $x = (x^1, \ldots, x^n) \to M$ is a coordinate chart for manifold
$M$, then $\partial_i := \frac{\partial}{\partial x^i}\big|_p$ is a basis for the tangent space $T_pM$, and its dual basis is written
$dx^i \in T_p^*(M)$: by definition they obey $dx^i(\partial_j) = \delta_{ij}$.

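Changing local coordinates from $x$ to $y = y(x)$, the chain rule tells us

$$\frac{\partial}{\partial y^i} = \sum_{j=1}^n \frac{\partial x^j}{\partial y^i} \frac{\partial}{\partial x^j}, \tag{1.2.4a}$$

and hence the 1-form basis changes by the inverse formula:

$$dy^i = \sum_{j=1}^n \frac{\partial y^i}{\partial x^j}\, dx^j. \tag{1.2.4b}$$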


The main purpose of differential forms is integration (hence their notation). If we
regard the integrand of a line-integral as a 1-form field (i.e. a choice of 1-form for
each point $p \in M$), we make manifest the choice of measure. Rather than saying the
ambiguous ‘integrate the constant function “$f(p) = 1$” along the manifold $S^1$’, we say
the unambiguous ‘integrate the 1-form “$\omega_p = d\theta$” along the manifold $S^1$’. Likewise,
the integrands of double-, triple-, etc. integrals are 2-forms, 3-forms, etc., dual to tensor
products of tangent spaces. We can evaluate these integrals by introducing coordinate
patches and thus reducing them to usual $\mathbb{R}^n$ integrals over components of the differential
form. The spirit of manifolds is to have a coordinate-free formalism; changing local
coordinates (e.g. when moving from one coordinate patch to an overlapping one) changes
those components as in (1.2.4b) in such a way that the value of the integral won’t change.

A standard example of a 1-form field is the gradient $df$ of a function $f \in C^\infty(M)$,
defined at each point $p \in M$ by the rule: given any tangent vector $D_v \in T_p(M)$, define
the number $(df)(D_v)$ to be the value of the directional derivative $D_v(f)(p)$ at $p$.

A familiar example of a 2-form $g_p \in T_p^*(M) \otimes T_p^*(M)$ is the metric tensor on $T_p(M)$.
Given two vectors $u, v \in T_p$, the number $g_p(u, v)$ is to be thought of as their inner-product.
A Riemannian manifold is a manifold $M$ together with a 2-form field $g$, which is
symmetric and nondegenerate (usually positive-definite).⁵ Given a local coordinate about
$p \in M$, a basis for the tangent space $T_pM$ is $\frac{\partial}{\partial x^i}$ and we can describe the metric tensor
$g_p$ using an $n \times n$ matrix whose $ij$-entry is $g_{ij}(p) := g_p\left(\frac{\partial}{\partial x^i}, \frac{\partial}{\partial x^j}\right)$, or in infinitesimal
language as $ds^2 = \sum_{i,j=1}^n g_{ij}\, dx^i dx^j$, a form more familiar to most physicists.


5 Whitney’s aforementioned embedding of M into Euclidean space implies that any manifold can be given a 
Riemannian structure, since a submanifold of a Riemannian manifold naturally inherits the Riemannian 
structure. The Beautiful Mind of John Nash proved that any Riemannian structure on a given n-dimensional 
manifold M can likewise be inherited from its embedding into some sufficiently large-dimensional 
Euclidean space.

Much structure comes with this metric tensor field $g$. Most important, of course, we
can define lengths of curves and the angles with which they intersect. In particular, the
arc-length of the curve $\gamma : [0, 1] \to M$ is the integral

$$\int_0^1 \sqrt{g_{\gamma(t)}\left(\frac{d\gamma}{dt}, \frac{d\gamma}{dt}\right)}\; dt,$$

a quantity independent of the specific parametrisation $t \mapsto \gamma(t)$ chosen (verify this).
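Before verifying the parametrisation-independence by hand, it can be reassuring to see it numerically. The sketch below rests on choices that are ours, not the text's: the metric is the round metric $d\theta^2 + \sin^2\theta\, d\phi^2$ of the sphere in coordinates $(\theta, \phi)$, the curve `gamma` and the reparametrisation $s(t) = (t + t^2)/2$ are arbitrary, and the integral is approximated by the midpoint rule with central differences for the velocity.

```python
import math

def metric(p):
    """Assumed metric: the round sphere in coordinates (theta, phi)."""
    theta, _phi = p
    return ((1.0, 0.0), (0.0, math.sin(theta) ** 2))

def gamma(t):
    """A sample curve [0, 1] -> M written in those coordinates."""
    return (0.5 + t, 2.0 * t * t)

def reparam(t):
    """The same curve traversed at a different speed: gamma(s(t)), s(t) = (t + t^2) / 2."""
    return gamma(0.5 * (t + t * t))

def arc_length(curve, n=20000, h=1e-6):
    """Midpoint-rule approximation of the arc-length integral over [0, 1]."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        a, b = curve(t - h), curve(t + h)
        v = ((b[0] - a[0]) / (2 * h), (b[1] - a[1]) / (2 * h))  # velocity by central difference
        g = metric(curve(t))
        speed_sq = sum(g[i][j] * v[i] * v[j] for i in range(2) for j in range(2))
        total += math.sqrt(speed_sq) / n
    return total

length_a = arc_length(gamma)
length_b = arc_length(reparam)
```

The two lengths agree up to discretisation error, as the change-of-variables formula predicts.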

Also, we can use the metric to identify each $T_p^*$ with $T_p$, just as the standard inner-product
in $\mathbb{R}^n$ permits us to identify a column vector $u \in \mathbb{R}^n$ with its transpose $u^t \in \mathbb{R}^{n*}$.
Moreover, given any curve $\sigma : [0, 1] \to M$ connecting $\sigma(0) = p$ to $\sigma(1) = q$, we can
identify the tangent spaces $T_p(M)$ and $T_q(M)$ by parallel-transport. Using this, we can
define a derivative (the so-called ‘covariant derivative’) that respects the metric, and a
notion of geodesic (a curve that parallel-transports its own tangent vector, and which
plays the role of ‘straight line’ here). In short, on a Riemannian manifold geometry in
its fullest sense is possible. See, for example, [104] for more details.

Many manifolds locally look like a Cartesian product $A \times B$. A fibre bundle $p : E \to B$
locally (i.e. on small open sets $U$ of $E$) looks like $F \times V$, where $F \cong p^{-1}(b)$ (for
any $b \in B$) is called the fibre, and $V$ is an open set in the base $B$. For example, the
(open) cylinder and Möbius strip are both fibre bundles with base $S^1$ and fibre $(0, 1) \subset \mathbb{R}$.
A section $s : B \to E$ obeys $p \circ s = \mathrm{id}$, that is, for each small open set $V$ of $B$ it is a
function $V \to F$. A vector bundle is a fibre bundle with fibre a vector space $F = \mathcal{V}$; for
example, the tangent bundle $TM$ is a vector bundle with base $M$ and fibre $\cong T_pM$. We
write $\Gamma(E)$ for the space of sections of a vector bundle $E$. A line bundle is a vector bundle
with one-dimensional fibre (so the sections of a line bundle locally look like complex- or
real-valued functions on the base). A connection on a vector bundle $E \to B$ is a way to
differentiate sections (the covariant derivative). An example is a Riemannian structure
on the tangent bundle $E = TB$. See, for example, [104] for details and examples.

Felix Klein’s Erlangen Programm (so called because he announced it there) is a 
strategy relating groups and geometry. Geometry, it says, consists of a manifold (the space 
of points) and a group of automorphisms (transformations) of the manifold preserving 
the relevant geometric structures (e.g. length, angle, lines, etc.). Conversely, given a 
manifold and a group of automorphisms, we should determine the invariants relative to 
the group. Several different geometries are possible on the same manifold, distinguished 
by their preferred transformations. 

For example, Euclidean geometry in its strongest sense (i.e. with lengths, angles, lines,
etc.) has the group of symmetries generated by rotations, reflections and translations,
that is, any transformation of the form $x \mapsto xA + a$, where $x, a \in \mathbb{R}^n$ (regarded as row
vectors, say) and $A$ is an orthogonal $n \times n$ matrix. If our context is scale-independent (e.g.
when studying similar triangles), we can allow $A$ to obey $AA^t = \lambda I$ for any $\lambda > 0$.

More interesting is projective geometry. Here, angles and lengths are no longer invariants,
but lines are. Projective geometry arose from the theory of perspective in art. The
transformations of projective $n$-geometry come from projections $\mathbb{R}^{n+1} \to \mathbb{R}^n$.

More precisely, consider real projective $n$-space $\mathbb{P}^n(\mathbb{R})$. We coordinatise it using
homogeneous coordinates: $\mathbb{P}^n(\mathbb{R}) = \mathbb{R}^{n+1\,\prime}/\!\sim$ consists of $(n + 1)$-tuples of real numbers,
where we identify points with their nonzero multiples. The origin $(0, 0, 0, \ldots, 0)$ in
$(n + 1)$-space is excluded from projective space (hence the prime), as it belongs to all such lines.
A projective ‘point’ consists of points on the same line through the origin; a projective
‘line’ consists of the points on a plane through the origin; etc. By convention, any equation
in homogeneous coordinates is required to be homogeneous (so that a point satisfies an
equation iff its whole line does). Complex projective space $\mathbb{P}^n(\mathbb{C})$ is defined similarly.

To see what projective geometry is like, consider first the projective line $\mathbb{P}^1(\mathbb{R})$. Take
any point $(x, y) \in \mathbb{P}^1(\mathbb{R})$. If $y \neq 0$ we may divide by it, and we get points of the
form $(x', 1)$. These are in one-to-one correspondence with the points in the real line.
If, on the other hand, $y = 0$, then we know $x \neq 0$ and so we should divide by $x$: what
we get is the point $(1, 0)$, which we can think of as the infinite point $(\frac{1}{0}, 1)$. Thus the
real projective line $\mathbb{P}^1(\mathbb{R})$ consists of the real line, together with a point ‘at infinity’.
Similarly, the complex projective line consists of the complex plane $\mathbb{C}$ together with a
point at infinity; topologically, this is a sphere named after Riemann.

More generally, $\mathbb{P}^n(\mathbb{R})$ consists of the real space $\mathbb{R}^n$, together with a copy of $\mathbb{P}^{n-1}(\mathbb{R})$
as the hyperplane of infinite points. These points at infinity are where parallel lines meet.
Intuitively, projective geometry allows us to put ‘finite’ and ‘infinite’ points on an equal
footing; we can see explicitly how, for example, curves look at infinity.

For example, the ‘parallel’ lines $x = 0$ and $x = 1$ in $\mathbb{P}^2(\mathbb{R})$ correspond to the
homogeneous equations $x = 0$ and $x = z$, and so to the points with homogeneous coordinates
$(0, y, z)$ and $(x, y, x)$. They intersect at the ‘infinite’ point $(0, y, 0) \sim (0, 1, 0)$. The
parabola $y = x^2$ has only one infinite point (namely $(0, 1, 0)$), the hyperbola $xy = 1$ has
two infinite points ($(1, 0, 0)$ and $(0, 1, 0)$), while the circle $x^2 + y^2 = 1$ doesn’t have any.
Intuitively, the parabola is an ellipse tangent to the line (really, circle) at infinity, while
the hyperbola is an ellipse intersecting it transversely.
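The count of infinite points can be automated for any conic. In homogeneous coordinates a conic reads $a x^2 + b xy + c y^2 + (\text{terms containing } z) = 0$, so setting $z = 0$ leaves only the quadratic form $a x^2 + b xy + c y^2$. The helper below, `points_at_infinity`, is a hypothetical name of our own; it is a minimal sketch that lists the real solutions $(x, y, 0)$ up to scaling.

```python
import math

def points_at_infinity(a, b, c):
    """Real points (x, y, 0), up to scaling, with a x^2 + b x y + c y^2 = 0."""
    if a == b == c == 0:
        raise ValueError("the whole line at infinity lies on the curve")
    points = []
    if c == 0:
        # The form factors as x (a x + b y): x = 0 gives the direction (0, 1, 0).
        points.append((0.0, 1.0, 0.0))
        if b != 0:
            points.append((1.0, -a / b, 0.0))   # the other factor, a x + b y = 0
    else:
        # Put (x, y) = (1, m) and solve c m^2 + b m + a = 0 over the reals.
        disc = b * b - 4 * a * c
        if disc == 0:
            points.append((1.0, -b / (2 * c), 0.0))
        elif disc > 0:
            root = math.sqrt(disc)
            points.append((1.0, (-b - root) / (2 * c), 0.0))
            points.append((1.0, (-b + root) / (2 * c), 0.0))
    return points

parabola = points_at_infinity(1, 0, 0)    # y z = x^2, i.e. x^2 - y z = 0
hyperbola = points_at_infinity(0, 1, 0)   # x y = z^2
circle = points_at_infinity(1, 0, 1)      # x^2 + y^2 = z^2
```

The three calls reproduce the one, two and zero infinite points named in the text.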

Klein’s group of transformations here is the projective linear group $\mathrm{PGL}_{n+1}(\mathbb{R})$, that is
all invertible $(n + 1) \times (n + 1)$ matrices $A$ where we identify $A$ with $\lambda A$ for any nonzero
number $\lambda$. It acts on the homogeneous coordinates in the usual way: $x \mapsto xA$. This
group mixes thoroughly the so-called infinite points with the finite ones, and emphasises
that infinite points in projective geometry are completely on a par with finite ones.

For example, the transformation

$$A = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}$$

maps the parabola $y = x^2$ to the hyperbola $xy = 1$, indicating that these are projectively
identical curves.
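This claim is easy to test on sample points. The sketch below writes the parabola and hyperbola in homogeneous form ($yz = x^2$ and $xy = z^2$), lets $A$ act on row vectors as $x \mapsto xA$, and uses integer points so that the arithmetic is exact. The function names are our own.

```python
A = ((0, 0, 1),
     (0, 1, 0),
     (1, 0, 0))

def act(x, matrix):
    """Row vector times matrix: (xA)_j = sum_i x_i A_ij."""
    return tuple(sum(x[i] * matrix[i][j] for i in range(3)) for j in range(3))

def parabola(p):
    """Homogeneous form of y = x^2, namely y z - x^2."""
    x, y, z = p
    return y * z - x * x

def hyperbola(p):
    """Homogeneous form of x y = 1, namely x y - z^2."""
    x, y, z = p
    return x * y - z * z

# Finite points (s, s^2, 1) of the parabola, together with its infinite point (0, 1, 0).
points = [(s, s * s, 1) for s in range(-3, 4)] + [(0, 1, 0)]
images = [act(p, A) for p in points]
```

Every image lies on the hyperbola. Note that the finite point $(0, 0, 1)$ of the parabola is sent to the infinite point $(1, 0, 0)$ of the hyperbola, which illustrates how the group mixes finite and infinite points.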

Projective geometry is central to modern geometry. The projective plane can be
axiomatised; for example, one axiom says that any two distinct lines intersect in exactly
one point. A remarkable property of projective plane geometry is that any theorem remains
a theorem if the words ‘line’ and ‘point’ are interchanged.

In summary, there are many different geometries. Which geometry to use (e.g. 
Euclidean, projective, conformal) in a given context depends on the largest possible 
group of transformations that respect the basic quantities.

Fig. 1.8 Two homotopic loops in $\pi_1 \cong F_4$.


1.2.3 Loops 


The last subsection used curves to probe the infinitesimal neighbourhood of any point
$p \in M$. We can also use curves to probe global features of manifolds.

Let $M$ be any manifold, and put $I = [0, 1]$. A loop at $p \in M$ is any continuous curve
$\sigma : I \to M$ with $\sigma(0) = \sigma(1) = p$. So $\sigma$ starts and ends at the point $p$. Let $\Omega(M, p)$ be
the set of all such loops. Loops $\sigma_0, \sigma_1 \in \Omega(M, p)$ are homotopic if $\sigma_0$ can be continuously
deformed into $\sigma_1$, that is if there is a continuous map $F : I \times I \to M$ with
$F(\star, s) \in \Omega(M, p)$ for every $s \in I$ and $\sigma_i(\star) = F(\star, i)$ for $i = 0, 1$. This defines an
equivalence relation on $\Omega(M, p)$. For instance all loops in $M = \mathbb{R}^n$ are homotopic, while
the homotopy equivalence classes for the circle $M = S^1$ (viewed as the unit circle in $\mathbb{C}$)
are parametrised by their winding number $n \in \mathbb{Z}$, that is by the contour integral
$\frac{1}{2\pi \mathrm{i}} \oint_\sigma \frac{dz}{z} \in \mathbb{Z}$.

Let $\pi_1(M, p)$ denote the set of all homotopy equivalence classes for $\Omega(M, p)$. It has
a natural group structure: $\sigma\sigma'$ is the curve that first goes from $p$ to $p$ following $\sigma$, and
then from $p$ to $p$ following $\sigma'$. More precisely,

$$(\sigma\sigma')(t) = \begin{cases} \sigma(2t) & \text{if } 0 \le t \le \frac{1}{2} \\ \sigma'(2t - 1) & \text{if } \frac{1}{2} \le t \le 1 \end{cases}. \tag{1.2.5}$$

For instance, the inverse $\sigma^{-1}$ is given by the curve traversed in the opposite direction:
$t \mapsto \sigma(1 - t)$. The identity is the constant curve $\sigma(t) = p$. With this operation $\pi_1(M, p)$
is called the fundamental group of $M$ (the subscript ‘1’ reminds us that a loop is a map
from $S^1$; likewise $\pi_k$ considers maps from the $k$-sphere $S^k$ to $M$). As long as any two
points in $M$ can be connected with a path, then all $\pi_1(M, p)$ will be isomorphic and we
can drop the dependence on ‘$p$’. When $\pi_1(M) = \{e\}$, we say $M$ is simply connected.
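For the circle these definitions can be tried out directly. The sketch below makes assumptions of our own: $S^1$ is the unit circle in $\mathbb{C}$ with base point $p = 1$, loops are Python functions on $[0, 1]$, and the winding number is approximated by summing small changes of argument, a discretised form of $\frac{1}{2\pi \mathrm{i}} \oint \frac{dz}{z}$. It checks that the product (1.2.5) adds winding numbers and that the reversed loop has the opposite winding number.

```python
import cmath
import math

def loop(n):
    """The loop t -> exp(2 pi i n t) on the unit circle, based at p = 1."""
    return lambda t: cmath.exp(2j * math.pi * n * t)

def concat(sigma, sigma_prime):
    """The product of two loops, following (1.2.5)."""
    return lambda t: sigma(2 * t) if t <= 0.5 else sigma_prime(2 * t - 1)

def inverse(sigma):
    """The same loop traversed in the opposite direction."""
    return lambda t: sigma(1 - t)

def winding_number(sigma, samples=4000):
    """Total change of argument along the loop, divided by 2 pi and rounded."""
    total = 0.0
    prev = sigma(0.0)
    for k in range(1, samples + 1):
        cur = sigma(k / samples)
        total += cmath.phase(cur / prev)   # small angle increment between samples
        prev = cur
    return round(total / (2 * math.pi))

w_product = winding_number(concat(loop(2), loop(-5)))
w_inverse = winding_number(inverse(loop(2)))
w_trivial = winding_number(concat(loop(2), inverse(loop(2))))
```

The sample count only needs to be large enough that consecutive samples differ by an angle well below $\pi$.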

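For example, $\pi_1(\mathbb{R}^n) \cong 1$ and $\pi_1(S^1) \cong \mathbb{Z}$. The complex plane $\mathbb{C}$ with $n$ points removed
has fundamental group $\pi_1(\mathbb{C} \setminus \{z_1, \ldots, z_n\}) \cong F_n$, the free group; Figure 1.8 gives two
paths homotopic to Kats $\in F_4$. The torus $S^1 \times S^1$ has $\pi_1 \cong \mathbb{Z} \oplus \mathbb{Z}$.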

The braid group (1.1.9), as with any group, also has a realisation as a fundamental
group. Let $\mathfrak{C}_n$ be $\mathbb{C}^n$ with all diagonals removed:

$$\mathfrak{C}_n = \{(z_1, \ldots, z_n) \in \mathbb{C}^n \mid z_i \neq z_j \text{ whenever } i \neq j\}. \tag{1.2.6}$$

Then it is easy to see that the pure braid group $P_n$ is isomorphic to $\pi_1(\mathfrak{C}_n)$: indeed,
given any pure braid $\alpha \in P_n$, the value of the $i$th coordinate $\sigma(t)_i$ of the corresponding loop
$\sigma \in \pi_1(\mathfrak{C}_n)$ will be the position of the $i$th strand when we take a slice at $t$ through our braid
($t = 0$ is the top of the braid, $t = 1$ the bottom). Now, the symmetric group $S_n$ acts freely
(i.e. without fixed points) on $\mathfrak{C}_n$ by permuting the coordinates: $\pi.z = (z_{\pi 1}, \ldots, z_{\pi n})$. The
space $\mathfrak{C}_n / S_n$ of orbits under this action has fundamental group $\pi_1(\mathfrak{C}_n / S_n) \cong B_n$.

Fig. 1.9 Some trivial knots.

Fig. 1.10 The trefoil.

Fig. 1.11 A wild knot.

Note that if $f : M' \to M$ is a homeomorphism, then it induces a group homomorphism
$f_* : \pi_1(M') \to \pi_1(M)$. We return to this in Section 1.7.2.

By a link we mean a diffeomorphic image of $S^1 \cup \cdots \cup S^1$ into $\mathbb{R}^3$. A knot is a link
with one strand; see Figures 1.9 and 1.10. Since $S^1$ comes with an orientation, so does
each strand of a link. The reason for requiring the embedding $f : S^1 \cup \cdots \cup S^1 \to \mathbb{R}^3$
to be differentiable is that we want to avoid ‘wild knots’ (see Figure 1.11); almost every
homeomorphic image of $S^1 \cup \cdots \cup S^1$ will be wild at almost every point.

Two links are equivalent, i.e. ambient isotopic, if continuously deforming one link
yields the other. The word ‘ambient’ is used because the isotopy is applied to the ambient
space $\mathbb{R}^3$. This is the intuitive notion of equivalent knots in a string, except that we glue
the two ends of the string together (we can trivially untie any knotted open string by
slipping the knot off an end). By a trivial knot or the unknot we mean any knot ambient
isotopic to (say) the unit circle in the $xy$-plane in $\mathbb{R}^3$.

We choose $\mathbb{R}^3$ for the ambient space because any link in $\mathbb{R}^n$, for $n \geq 4$, is trivial,
and the Jordan Curve Theorem tells us that there are only two different ‘knots’ in $\mathbb{R}^2$
(distinguished by their orientation). More generally, knotted $k$-spheres $S^k$ in $\mathbb{R}^n$ are
nontrivial only when $n = k + 2$ [478].

Fig. 1.12 The Reidemeister moves I, II, III, respectively.

Fig. 1.13 The possible (non)crossings $C_-$, $C_+$, $C_0$.

It isn’t difficult to show [478] that two links are ambient isotopic iff their diagrams can
be related by making a finite sequence of moves of the form given in Figure 1.12. The
Reidemeister moves are useless at deciding directly whether two knots are equivalent, or
even whether a given knot is trivial. Indeed, this seems difficult no matter which method
is used, although a finite algorithm (by Haken and Hemion [283]) apparently exists. A
very fruitful approach has been to assign to a link a quantity (called a link invariant),
usually a polynomial, in such a way that ambient isotopic links get the same quantity.
One of these is the Jones polynomial $J_L$, which can be defined recursively by a skein
relation. Start with any (oriented) link diagram and choose any crossing; up to a rotation
it will either look like the crossing $C_+$ or $C_-$ in Figure 1.13. There are two things we
can do to this crossing: we can pass the strings through each other (so the crossing of
type $C_\pm$ becomes one of type $C_\mp$); or we can erase the crossing as in $C_0$. In this way we
obtain three links: the original one (which we could call $L_\pm$ depending on the orientation
of the chosen crossing) and the two modified ones ($L_\mp$ and $L_0$). The skein relation is

$$t^{-1} J_{L_+}(t) - t\, J_{L_-}(t) + (t^{-1/2} - t^{1/2})\, J_{L_0}(t) = 0. \tag{1.2.7}$$

We also define the polynomial $J(t)$ of the unknot to be identically 1.

For a link with an odd number of components, $J_L(t) \in \mathbb{Z}[t^{\pm 1}]$ is a Laurent polynomial
in $t$, while for an even number $J_L(t) \in \sqrt{t}\, \mathbb{Z}[t^{\pm 1}]$. For example, applying (1.2.7) twice,
we get that the Jones polynomial of the trefoil in Figure 1.10 is $J(t) = -t^4 + t^3 + t$.
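The two applications of (1.2.7) can be carried out mechanically. The sketch below represents Laurent polynomials in $s = t^{1/2}$ as dictionaries from exponents of $s$ to integer coefficients. It rests on assumptions that are not visible in the text: that Figure 1.13 follows the usual sign convention for $C_\pm$; that switching a crossing of the (positive) trefoil gives the unknot while smoothing it gives the (positive) Hopf link; that switching a crossing of that Hopf link gives two unlinked circles while smoothing it gives the unknot; and that two unlinked circles have polynomial $-(t^{1/2} + t^{-1/2})$, which we check against (1.2.7) using a single kink in the unknot.

```python
def normalise(p):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {e: c for e, c in p.items() if c != 0}

def add(p, q):
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
    return normalise(out)

def mul(p, q):
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return normalise(out)

def skein_residual(j_plus, j_minus, j_zero):
    """Left side of (1.2.7) with t = s^2: s^-2 J+ - s^2 J- + (s^-1 - s) J0."""
    return add(add(mul({-2: 1}, j_plus), mul({2: -1}, j_minus)),
               mul({-1: 1, 1: -1}, j_zero))

def solve_plus(j_minus, j_zero):
    """Solve (1.2.7) for J+: J+ = s^4 J- + (s^3 - s) J0."""
    return add(mul({4: 1}, j_minus), mul({3: 1, 1: -1}, j_zero))

def mirror(p):
    """Replace t by 1/t."""
    return {-e: c for e, c in p.items()}

UNKNOT = {0: 1}
UNLINK2 = {1: -1, -1: -1}               # assumed value -(s + 1/s) for two unlinked circles
HOPF = solve_plus(UNLINK2, UNKNOT)      # first application of (1.2.7)
TREFOIL = solve_plus(UNKNOT, HOPF)      # second application of (1.2.7)
```

Here `TREFOIL` comes out as $-s^8 + s^6 + s^2$, that is $-t^4 + t^3 + t$, and `mirror(TREFOIL)` is a different polynomial.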

Are the trefoil and its mirror image ambient isotopic? The easiest argument uses the
Jones polynomial: taking the mirror image corresponds to replacing $t$ with $t^{-1}$, and we
see that the Jones polynomial of the trefoil is not invariant under this transformation.⁶


6 More generally, an alternating knot with odd crossing number will be inequivalent with its mirror image (the crossing
number is the minimum number of crossings needed in a diagram of the knot).

Fig. 1.14 The link associated with a braid.

Fig. 1.15 A Markov move of type II.


The Reidemeister moves quickly prove $J_L(t)$ is a knot invariant, i.e. equivalent knots
have the same polynomial, although inequivalent knots can also have the same one.
But it was the first new knot polynomial in 56 years. It triggered discoveries of several
other invariants while making unexpected connections elsewhere (Section 6.2.6), and
secured for Jones a Fields medal. The problem then became that there were too many
link invariants. We explain how we now organise them in Section 1.6.2.

Braids and links are directly related by theorems of Alexander (1923) and Markov
(1935). Given any braid $\alpha$ we can define a link by connecting the $i$th spot on the bottom of
the braid with the $i$th spot on the top, as in Figure 1.14. Alexander’s theorem tells us that
all links come from a braid in this way. Certainly though, different braids can correspond
to the same link: for example, take any $\alpha, \beta \in B_n$, then the links of $\alpha$ and $\beta\alpha\beta^{-1}$ are
the same (slide the braid $\beta^{-1}$ counterclockwise around the link until it is directly above,
and hence cancels, $\beta$). This is called a Markov move of type I. A Markov move of type
II changes the number of strands in a braid by $\pm 1$, in a simple way; see Figure 1.15.
Markov’s theorem [59] says that two braids $\alpha \in B_n$, $\beta \in B_m$ correspond to equivalent
links iff they are related by a finite sequence of Markov⁷ moves. In Section 6.2.5, we
explain how to use these two theorems to construct link invariants.


Question 1.2.1. Come up with a reasonable definition for the automorphism group of a 
lattice. Prove that the automorphism group of a positive-definite lattice is always finite. 


7 His father is the Markov of Markov chains.


Question 1.2.2. Let $x = (x_1, x_2)$ be any vector with nonzero coordinate $x_2$. Write $L(x)$
here for the lattice $\mathbb{Z}(1, 0) + \mathbb{Z}x$, and $T(x)$ for the torus $\mathbb{R}^2 / L(x)$. Which $x$’s give
pointwise identical lattices (i.e. given $x$, find all $y$ such that $L(x) = L(y)$)? Verify that
all tori are diffeomorphic. Which tori $T(x)$ are obviously conformally equivalent?


Question 1.2.3. If we drop the requirement in Definition 1.2.1 that the x be a basis, 
does anything really bad happen? 


Question 1.2.4. Prove Theorem 1.2.2. 


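Question 1.2.5. Let $L$ be an integral lattice. What is special about the reflection $r_\alpha$
through a vector $\alpha \in L$ with norm-squared $\alpha \cdot \alpha = 2$? (The formula for the reflection $r_\alpha$
is $r_\alpha(x) = x - \frac{2\, x \cdot \alpha}{\alpha \cdot \alpha}\, \alpha$.)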


Question 1.2.6. Prove from (1.2.7) that the Jones polynomial for a link and its mirror
image can be obtained from each other by the switch $t \leftrightarrow t^{-1}$. Prove that the Jones
polynomial of a link is unchanged if the orientation of any component (i.e. the arrow on
any strand) is reversed.


Question 1.2.7. Find the Jones polynomial of the disjoint union of n circles. 


1.3 Elementary functional analysis 


Moonshine concerns the occurrence of modular forms in algebra and physics, and care is
taken to avoid analytic complications as much as possible. But spaces here are unavoidably
infinite-dimensional, and through this arise subtle but significant points of contact
with analysis. For example, the $q^{1/24}$ prefactor in the Dedekind eta (2.2.6b), and
the central extension of loop algebras (3.2.2a), are analytic fingerprints. Lie group
representations usually involve functional analysis (see e.g. Section 2.4.2 where we
relate the Heisenberg group to theta functions). Much of functional analysis was developed
to address mathematical concerns in quantum theory, and perhaps all of the rich
subtleties of quantum field theory can be interpreted as functional analytic technicalities.
For example, anomalies (which for instance permit derivations of the Atiyah–Singer
Index Theorem from super Yang–Mills calculations) can be explained through a careful
study of domains of operators [172]. Moreover, the natural culmination of the Jones
knot polynomial is a deep relation between subfactors and conformal field theories
(Section 6.2.6). The necessary background for all this is supplied in this section.

In any mature science such as mathematics, the division into branches is a convenient 
lie. In this spirit, analysis can be distinguished from, say, algebra by the central role played 
in the former by numerical inequalities. For instance, inequalities appear in the definition 
of derivatives and integrals as limits. Functional analysis begins with the reinterpretation 
of derivatives and integrals as linear operators on vector spaces. These spaces, which 
consist of appropriately restricted functions, are infinite-dimensional. The complexity 
and richness of the theory comes from this infinite-dimensionality.

Section 1.3.1 assumes familiarity with elementary point-set topology, as well as the
definition of Lebesgue measure. All the necessary background is contained in standard
textbooks such as [481].


1.3.1 Hilbert spaces 


By a vector space $\mathcal{V}$, we mean something closed under finite linear combinations
$\sum_i a_i v^{(i)}$. Here we are primarily interested in infinite-dimensional spaces over the
complex numbers (i.e. the scalars $a_i$ are taken from $\mathbb{C}$), and the vectors $v$ are typically
functions $f$. By a (complex) pre-Hilbert space we mean a vector space $\mathcal{V}$ with a
Hermitian form $\langle f, g \rangle \in \mathbb{C}$ (‘Hermitian form’ is defined in Section 1.1.3). All complex
$n$-dimensional pre-Hilbert spaces are isomorphic to $\mathbb{C}^n$ with Hermitian form

$$\langle u, v \rangle = \bar{u}_1 v_1 + \cdots + \bar{u}_n v_n.$$

The analogue of $\mathbb{C}^n$ in countably many dimensions is $\ell^2(\infty)$, which consists of all
sequences $u = (u_1, u_2, \ldots)$ with finite sum $\sum_{i=1}^\infty |u_i|^2 < \infty$. The reader can verify that
it is closed under sums and thus forms a pre-Hilbert space. Another example consists of
the $C^\infty$-functions $f : \mathbb{R}^n \to \mathbb{C}$, say, with ‘compact support’ (that means that the set of
all $x \in \mathbb{R}^n$ for which $f(x) \neq 0$ is bounded). The Hermitian form here is

$$\langle f, g \rangle = \int_{\mathbb{R}^n} \overline{f(x)}\, g(x)\, d^n x\,; \tag{1.3.1}$$

this pre-Hilbert space is denoted $C_c^\infty(\mathbb{R}^n)$. For instance, the function defined by

$$f(x) = \begin{cases} \exp\left[\frac{-1}{1 - x^2}\right] & \text{for } -1 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$

lies in $C_c^\infty(\mathbb{R})$. A larger space, arising for instance in quantum mechanics, is denoted
$\mathcal{S}(\mathbb{R}^n)$ and consists of all functions $f \in C^\infty(\mathbb{R}^n)$ that, together with their derivatives,
decrease to 0 faster than any power of $|x|^{-1}$, as $|x| \to \infty$. The space $\mathcal{S}$ is a pre-Hilbert
space, again using (1.3.1). It contains functions such as $\mathrm{poly}(x_1, \ldots, x_n)\, e^{-x_1^2 - \cdots - x_n^2}$.

A pre-Hilbert space has a notion of distance, or norm $\|f\|$, given by $\|f\|^2 = \langle f, f \rangle$.
Using this we can define limits, Cauchy sequences, etc. in the usual way [481]. We call
a subset $X$ of $\mathcal{V}$ dense in $\mathcal{V}$ if for any $f \in \mathcal{V}$ there is a sequence $f_n \in X$ that converges
to $f$. For instance, the rationals $\mathbb{Q}$ are dense in the reals $\mathbb{R}$, but the integers aren’t. Any
convergent sequence is automatically Cauchy; a pre-Hilbert space $\mathcal{V}$ is called complete
if conversely all Cauchy sequences in it converge.


Definition 1.3.1 A Hilbert space $\mathcal{H}$ is a complete pre-Hilbert space.


For example, each $\mathbb{C}^n$ is Hilbert, as is $\ell^2(\infty)$. Most pre-Hilbert spaces aren’t Hilbert, for
example neither $C_c^\infty(\mathbb{R}^n)$ nor $\mathcal{S}(\mathbb{R}^n)$ is. However, given any pre-Hilbert space $\mathcal{V}$, there
is a Hilbert space $\mathcal{H}$ that contains $\mathcal{V}$ as a dense subspace. This Hilbert space $\mathcal{H}$ is called
the completion $\overline{\mathcal{V}}$ of $\mathcal{V}$, and is unique up to isomorphism. The construction of $\mathcal{H}$ from
$\mathcal{V}$ is analogous to the construction of $\mathbb{R}$ from $\mathbb{Q}$, obtained by defining an equivalence
relation on the Cauchy sequences.

The Hilbert space completion $\overline{C_c^\infty(\mathbb{R}^n)} = \overline{\mathcal{S}(\mathbb{R}^n)}$ is defined using the ‘Lebesgue
measure’ $\mu$, which is an extension of the usual notion of length to a much more general class
of subsets $X \subset \mathbb{R}$ than the intervals, and the ‘Lebesgue integral’ $\int f(x)\, d\mu(x)$, which is
an extension of the usual Riemann integral to a much more general class of functions than
the piecewise continuous ones. For example, what is the length of the set $X$ consisting
of all rational numbers between 0 and 1? This isn’t defined, but its Lebesgue measure is
easily seen to be 0. We won’t define Lebesgue measures and integrals here, because we
don’t really need them; a standard account is [481]. The completion of $C_c^\infty(\mathbb{R}^n)$ is the
Hilbert space $L^2(\mathbb{R}^n)$ consisting of all square-integrable functions $f : \mathbb{R}^n \to \mathbb{C} \cup \{\infty\}$.
The Hermitian form is given by $\langle f, g \rangle = \int_{\mathbb{R}^n} \overline{f(x)}\, g(x)\, d\mu(x)$. By $f$ ‘square-integrable’
we mean that $f$ is ‘measurable’ (e.g. any piecewise continuous function is measurable)
and $\langle f, f \rangle < \infty$. We must identify two functions $f, g$ if they agree almost everywhere,
that is the set $X$ of all $x \in \mathbb{R}^n$ at which $f(x) \neq g(x)$ has Lebesgue measure 0. This is
because any two such functions have the property that $\langle f, h \rangle = \langle g, h \rangle$ for all $h$.

All Hilbert spaces we will consider, such as $L^2(\mathbb{R}^n)$, are separable. This means that
there is a countable orthonormal set $X$ of vectors $e_n \in \mathcal{H}$ (so $\langle e_n, e_m \rangle = \delta_{nm}$) such that the
pre-Hilbert space span($X$) consisting of all finite linear combinations $\sum a_m e_m$ is dense in
$\mathcal{H}$. That is, given any $f \in \mathcal{H}$, $f = \lim_{n \to \infty} \sum_{i=1}^n \langle e_i, f \rangle\, e_i$; we say that the topological
span of $X$ is $\mathcal{H}$. All infinite-dimensional separable Hilbert spaces are isomorphic to
$\ell^2(\infty)$. The easy proof sends $f \in \mathcal{H}$ to the sequence $(\langle e_1, f \rangle, \langle e_2, f \rangle, \ldots) \in \ell^2(\infty)$.

We are interested in linear maps. The first surprise is that continuity is not automatic. In
fact, let $T : \mathcal{V}_1 \to \mathcal{V}_2$ be a linear map between pre-Hilbert spaces. Then $T$ is continuous
at one point iff it’s continuous at all points, iff it is bounded, that is, iff there exists a
constant $C$ such that $\|Tf\| \leq C \|f\|$, for all $f \in \mathcal{V}_1$. If $\mathcal{V}_1$ is finite-dimensional, then it
is easy to show that any linear $T$ is bounded. But in quantum mechanics, for example,
most operators of interest are unbounded.

Another complication of infinite-dimensionality is that in practice we’re often interested
in linear operators whose domain is only a (dense) subspace of $\mathcal{H}$. For example,
the domains of the operators $f(x) \mapsto x f(x)$ or $f(x) \mapsto \frac{d}{dx} f(x)$ (the ‘position’ and
‘momentum’ operators of quantum mechanics; see Section 4.2.1) are proper subspaces
of $L^2(\mathbb{R})$. Those operators are well-defined though on $\mathcal{S}(\mathbb{R})$ (indeed, this is precisely
why the space $\mathcal{S}$ is so natural for quantum mechanics). Once again bounded operators
are simpler: if $T$ is a bounded linear operator on some dense subspace $\mathcal{V}$ of a Hilbert
space $\mathcal{H}$, then there is one and only one way to continuously extend the domain of $T$ to
all of $\mathcal{H}$.

The dual (or adjoint) $\mathcal{V}^*$ of a pre-Hilbert space $\mathcal{V}$ is defined as the space of all
continuous linear maps (functionals) $\mathcal{V} \to \mathbb{C}$. In general, $\mathcal{V}$ can be regarded as a subspace
of $\mathcal{V}^*$, with $f \in \mathcal{V}$ being identified with the functional $g \mapsto \langle f, g \rangle$; when $\mathcal{V}$ is a Hilbert
space $\mathcal{H}$, this identification defines an isomorphism $\mathcal{H}^* \cong \mathcal{H}$.

The functionals for $C_c^\infty$ are called distributions, while those for $\mathcal{S}$ are tempered
distributions. For example, the Dirac delta ‘$\delta(x - a)$’ is defined as the element of $\mathcal{S}(\mathbb{R})^*$
sending functions $\varphi \in \mathcal{S}(\mathbb{R})$ to the number $\varphi(a) \in \mathbb{C}$. (Tempered) distributions $F$ can all
be realised (non-uniquely) as follows: given $\alpha \in \mathbb{N}$ and a continuous function $f(x)$ of
polynomial growth, we get a functional $F \in \mathcal{S}(\mathbb{R})^*$ by

$$F(\varphi) = \int_{\mathbb{R}} f(x)\, \frac{d^\alpha \varphi}{dx^\alpha}\, dx. \tag{1.3.2}$$

A similar realisation holds for the spaces $\mathcal{S}(\mathbb{R}^n)$ and $C_c^\infty(\mathbb{R}^n)$. Of course distributions are
not functions, and we cannot rewrite (1.3.2) as $\int g(x)\, \varphi(x)\, d^n x$ for some function $g$. Note
that the Dirac delta is not well-defined on the completion $L^2(\mathbb{R})$ of $\mathcal{S}$, since the elements
$f \in L^2(\mathbb{R})$ are equivalence classes of functions and hence have ambiguous function
values $f(a)$. This beautiful interpretation of distributions like $\delta$ as linear functionals
is due to Sobolev and was developed by Schwartz, the 1950 Fields medalist. Another
interpretation, using formal power series, is given in Section 5.1.2.
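As a concrete instance of the realisation (1.3.2), the Dirac delta at $a$ arises from $\alpha = 2$ and the continuous ‘ramp’ $f(x) = \max(x - a, 0)$: integrating by parts twice gives $\int_a^\infty (x - a)\, \varphi''(x)\, dx = \varphi(a)$. The sketch below checks this numerically. The Gaussian test function, the truncation of the integral to $[-12, 12]$ and the midpoint rule are all choices of ours made for illustration.

```python
import math

def phi(x):
    """A test function in S(R): the Gaussian exp(-x^2)."""
    return math.exp(-x * x)

def phi_second(x):
    """Its second derivative, computed by hand: (4 x^2 - 2) exp(-x^2)."""
    return (4 * x * x - 2) * math.exp(-x * x)

def ramp(x, a):
    """A continuous function of polynomial growth: max(x - a, 0)."""
    return max(x - a, 0.0)

def F(a, lo=-12.0, hi=12.0, n=200000):
    """Midpoint-rule value of the integral of ramp(x, a) * phi''(x), i.e. (1.3.2) with alpha = 2."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += ramp(x, a) * phi_second(x)
    return total * h

delta_at_03 = F(0.3)    # should approximate phi(0.3)
delta_at_m1 = F(-1.0)   # should approximate phi(-1.0)
```

The ramp is not the only choice: adding any linear function to it changes nothing, since a linear function integrates to zero against $\varphi''$, which illustrates the non-uniqueness mentioned above.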

Distributions can be differentiated arbitrary numbers of times, and their partial deriva- 
tives commute (something not true of all differentiable functions). However, they usually 
cannot be multiplied together and thus form only a vector space, not an algebra. For more 
on distributions, see chapter 2 of [67] or chapter I of [244]. 

We’re most interested in unitary and self-adjoint operators. First, let’s define the 
adjoint. Let T : V → H be linear, where V is a subspace of H. Let U be the set 
of all g ∈ H for which there is a unique vector g* ∈ H such that for all f ∈ V, 
(g*, f) = (g, Tf). Define the map (adjoint) T* : U → H by T*(g) = g*. The adjoint 
T* exists (i.e. its domain U is non-empty) iff V is dense in H. In particular, T** need 
not equal T. When V is dense in H, U is a vector space and T* is linear. When T is 
bounded, so is T*, and its domain U is all of H. Note that (g, Tf) = (T*g, f) for all 
f ∈ V, g ∈ U, but that relation doesn’t uniquely specify T*. 
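
In finite dimensions the domain subtleties disappear and the adjoint is simply the conjugate transpose. The following is a minimal sketch in Python, assuming H = C² with a Hermitian form that is conjugate-linear in its first slot (a convention chosen here for illustration; the relation being checked does not depend on it). The helper names `inner`, `apply` and `adjoint` are hypothetical.

```python
def inner(u, v):
    """Hermitian form on C^n, conjugate-linear in the first slot (an assumed convention)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply(T, f):
    """Apply the matrix T (a list of rows) to the vector f."""
    return [sum(t * x for t, x in zip(row, f)) for row in T]

def adjoint(T):
    """Conjugate transpose of the matrix T."""
    n, m = len(T), len(T[0])
    return [[T[i][j].conjugate() for i in range(n)] for j in range(m)]

T = [[1 + 2j, 3j], [0.5, 2 - 1j]]
f = [1 - 1j, 2 + 0.5j]
g = [-3j, 1 + 1j]

lhs = inner(g, apply(T, f))           # (g, Tf)
rhs = inner(apply(adjoint(T), g), f)  # (T*g, f)
print(abs(lhs - rhs) < 1e-12)
```

Here T is bounded and defined on all of H, so T** = T; the failure of that identity mentioned above is a purely infinite-dimensional phenomenon.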

We call T self-adjoint if T = T* (so in particular this implies that their domains V, U 
are equal). This implies (Tf, g) = (f, Tg), but as before the converse can fail. If T is 
self-adjoint and unbounded, then its domain cannot be all of H. 

A linear map T : H₁ → H₂ between Hilbert spaces H₁, H₂ is unitary if it is both onto 
and obeys (Tf, Tg) = (f, g). Equivalently, T*T = TT* = 1. The surjectivity assumption 
is not redundant in infinite dimensions (Question 1.3.2). A unitary map is necessarily 
bounded. A famous example of a unitary operator is the Fourier transform f ↦ f̂, which, 
as usually defined, maps S(Rⁿ) onto itself; it extends to a unitary operator on L²(Rⁿ). 

To define limits, etc., one needs only a topology. This need not come from a norm, 
and in general many different topologies can naturally be placed on a space. For an 
artificial example, consider the real line R endowed with the discrete topology (in 
which any subset of R is open): then any function f : R → R will be continuous, a 
sequence x_n ∈ R will converge iff there is some N such that x_N = x_{N+1} = x_{N+2} = ···, 
and R with this topology is again complete. In the topology coming from the Hermitian 
form (1.3.1), S(Rⁿ) is incomplete; however, it is common to refine that topology somewhat. 
In this new topology, a sequence f_m ∈ S(R) converges to 0 iff for every a, b ∈ N 
we have 

\[ \lim_{m\to\infty}\, \sup_{x\in\mathbb{R}} \left| x^b\, \frac{d^a f_m(x)}{dx^a} \right| = 0. \]

This topology comes from interpreting S as the intersection of countably many Hilbert 
spaces; with it, S is complete. When we speak of S(Rⁿ) elsewhere in this book, we 
always take its topology to be this one (or its higher-dimensional analogue). Similar 
comments can be made for C^∞_cs(Rⁿ); see chapter I of [244] for details. With these new 
topologies, both S(Rⁿ) and C^∞_cs(Rⁿ) are examples of nuclear spaces;⁸ although they are 
not themselves Hilbert spaces (the completeness in Definition 1.3.1 must be in terms of 
the norm topology), they behave in a more finite-dimensional way, as is indicated by the 
Spectral Theorem given below. See, for example, [244] for more on nuclear spaces. 

The Spectral Theorem tells us in which sense we can diagonalise self-adjoint and 
unitary operators. To state it precisely, we need a small generalisation of the construction 
of ℓ²(∞). Consider any measure space X (e.g. X = R or S¹ with Lebesgue measure μ). Fix 
n = 1, 2, ..., ∞, and suppose that for each x ∈ X there is associated a copy H_n of Cⁿ or 
(if n = ∞) ℓ²(∞). We want to define the (orthogonal) direct integral over x ∈ X of these 
H_n’s. Consider all functions h : X → H_n, x ↦ h_x, that aren’t too wild and that obey the 
finiteness condition ∫_X ‖h_x‖² dμ < ∞. As usual, we identify two such functions h, g if 
they agree everywhere except on a subset of X of μ-measure 0. Defining a Hermitian 
form by (h, g) = ∫_X (h_x, g_x) dμ, the set of all such (equivalence classes of) h constitutes 
a Hilbert space denoted ∫_X H_n dμ (completeness is proved as for ℓ²(∞)). It is trivial to 
drop the requirement that the separable space H_n be fixed; see, for example, chapter 2 
of [67] for details of the direct integral ∫_X H(x) dμ. 

In finite dimensions any self-adjoint operator is diagonalisable. This fails in infinite 
dimensions: for example, both the ‘momentum operator’ i d/dx and the ‘position operator’ 
f(x) ↦ x f(x) are self-adjoint on the dense subspace S(R) of L²(R), but neither has 
any eigenvectors anywhere in L²(R). So we need to generalise eigen-theory. 

The statement of the Spectral Theorem simplifies when our operators act on S. So 
let T : S(Rⁿ) → S(Rⁿ) be linear. Diagonalising T would mean finding a basis for S 
consisting of eigenvectors of T. We can’t do that, but we get something almost as good. 
By a generalised eigenvector corresponding to the generalised eigenvalue λ ∈ C, we 
mean a tempered distribution F ∈ S* such that F(Tφ) = λ F(φ) for all φ ∈ S. For each 
λ, let E_λ ⊂ S* be the generalised eigenspace consisting of all such F. We say that the set 
of all generalised eigenvectors ∪_λ E_λ is complete if they distinguish all vectors in S, i.e. 
if, for any φ, φ′ ∈ S, we have F(φ) = F(φ′) for all generalised eigenvectors F ∈ ∪_λ E_λ 
iff φ = φ′. 


Theorem 1.3.2 (Spectral Theorem) 

(a) Let U : S(Rⁿ) → S(Rⁿ) be unitary. Then U extends uniquely to a unitary operator 
on all of L²(Rⁿ). All generalised eigenvalues λ lie on the unit circle |λ| = 1. We can 
express L²(Rⁿ) as a direct integral ∫_{|λ|=1} H(λ) dμ(λ) of Hilbert spaces H(λ) ⊂ E_λ, so 
that U sends the function h ∈ L²(Rⁿ) to the function Uh with λ-component (Uh)_λ = 
λ h_λ ∈ H(λ). Moreover, the generalised eigenvectors are complete. 

(b) Suppose A : S(Rⁿ) → S(Rⁿ) is self-adjoint. Then all generalised eigenvalues λ lie 
on the real line R. We can express L²(Rⁿ) as a direct integral ∫_R H(λ) dμ(λ) of Hilbert 
spaces H(λ) ⊂ E_λ, so that for each h ∈ S(Rⁿ), Ah has λ-component (Ah)_λ = λ h_λ. 
Moreover, the generalised eigenvectors are complete. 


⁸ Nuclear spaces were first formulated by Grothendieck, who began his mathematical life as a functional 
analyst before revolutionising algebraic geometry. The term ‘nuclear’ comes from ‘noyau’ (French for both 
‘nucleus’ and ‘kernel’), since the Kernel Theorem is a fundamental result holding for them. The ‘L’ in both 
ℓ² and L² is in honour of Lebesgue, and the symbol S honours Schwartz. 


For a simple example, consider the linear map U : L²(R) → L²(R) acting by translation: 
(Uf)(x) = f(x + 1). This is unitary, but it has no true eigenvectors in L². On 
the other hand, each point λ = e^{iy} on the unit circle is a generalised eigenvalue, 
corresponding to generalised eigenvector F_λ given by F_λ(φ) = ∫ e^{−ixy} φ(x) dx. The direct 
integral interpretation of L² corresponds to the association of any f(x) ∈ L² with its 
Fourier transform f_λ = f̂(y) = ∫ e^{−ixy} f(x) dx. The completeness of the generalised 
eigenvectors is implied by the Plancherel identity 

\[ \int_{-\infty}^{\infty} |f(x)|^2\,dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} |\hat{f}(y)|^2\,dy. \tag{1.3.3} \]
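
A finite-dimensional analogue may make this example more concrete. The sketch below, in Python, replaces L²(R) by C^N and translation by the cyclic shift (Uf)(x) = f(x + 1 mod N). This discretisation, the value N = 8 and the helper names are assumptions made for illustration, not taken from the text. Pairing with the character x ↦ e^{−2πixk/N} then plays the role of the generalised eigenvector, with eigenvalue e^{2πik/N} on the unit circle, and the discrete Plancherel identity carries a factor 1/N in place of 1/2π.

```python
import cmath

N = 8  # size of the cyclic group Z/N standing in for R (an illustrative choice)

def dft(f):
    """Unnormalised discrete Fourier transform: fhat[k] = sum_x e^{-2 pi i x k / N} f[x]."""
    return [sum(cmath.exp(-2j * cmath.pi * x * k / N) * f[x] for x in range(N))
            for k in range(N)]

def shift(f):
    """Cyclic translation (Uf)(x) = f(x + 1 mod N)."""
    return [f[(x + 1) % N] for x in range(N)]

f = [complex(x * x - 3, 2 - x) for x in range(N)]
fhat = dft(f)
shifted_hat = dft(shift(f))

# Translation acts diagonally on Fourier components: (Uf)^(k) = e^{2 pi i k / N} fhat(k).
diagonal = all(
    abs(shifted_hat[k] - cmath.exp(2j * cmath.pi * k / N) * fhat[k]) < 1e-9
    for k in range(N)
)

# Discrete Plancherel identity: sum |f|^2 = (1/N) sum |fhat|^2.
norm_f = sum(abs(v) ** 2 for v in f)
norm_fhat = sum(abs(v) ** 2 for v in fhat) / N
print(diagonal, abs(norm_f - norm_fhat) < 1e-9)
```

In this finite setting the characters are honest eigenvectors lying in the space itself; on L²(R) they fail to be square-integrable, which is exactly why distributions are needed.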


The Spectral Theorem as formulated also holds for C^∞_cs in place of S, and more generally 
for any rigged (or equipped) Hilbert space V ⊂ H ⊂ V*, where H is separable and 
V is nuclear (chapter I of [244]). Rigged Hilbert spaces help provide a mathematically 
elegant formulation of quantum theories. 


1.3.2 Factors 


Von Neumann algebras (see e.g. [319], [177]) can be thought of as symmetries of a 
(generally infinite) group. Their building blocks are called factors. Vaughan Jones initiated 
the combinatorial study of subfactors N of M (i.e. inclusions N ⊂ M where M, N are 
factors), relating it to, for example, knots, and for this won a Fields medal in 1990. In 
Section 6.2.6 we describe Jones’s work and the subsequent developments; this subsection 
provides the necessary background. Our emphasis is on accessibility. 

Let H be a (separable complex) Hilbert space. By L(H) we mean the algebra of 
all bounded operators on H (we write ‘1’ for the identity). For example, L(Cⁿ) is the 
space M_n(C) of all n × n complex matrices. Let ‘*’ be the adjoint (defined in the last 
subsection). Given a set S of bounded operators, denote by S′ its commutant, that is the 
set of all bounded operators x ∈ L(H) that commute with all y ∈ S: xy = yx. We write 
S″ := (S′)′ for the commutant of the commutant; clearly, S ⊂ S″. 


Definition 1.3.3 A von Neumann algebra M is a subalgebra of L(H) containing the 
identity 1, which obeys M = M* and M = M″. 


This is like defining a group by a representation. A von Neumann algebra can also be 
defined abstractly, which is equivalent except that (as we will see shortly) the natural 
notions of isomorphism are different in the concrete and abstract settings (just as the 
same group can have non-isomorphic representations). 
Of course L(H) is a von Neumann algebra. Given any subset S ⊂ L(H) with S* = S, 
the double-commutant S″ is a von Neumann algebra, namely the smallest one containing 
S. The space L^∞(R) of bounded functions f : R → C forms an abelian von Neumann 
algebra on the Hilbert space H = L²(R) by pointwise multiplication. More generally 
(replacing R with any other measure space X and allowing multiple copies of the Hilbert 
space L²(X)), all abelian von Neumann algebras are of that form. 

The centre Z(M) = M ∩ M′ of a von Neumann algebra M is an abelian one. Using the 
above characterisation Z(M) = L^∞(X), we can write M as a direct integral ∫_X M(λ) dλ 
of von Neumann algebras M(λ) with trivial centre: Z(M(λ)) = C1. The direct integral, 
discussed last subsection, is a continuous analogue of direct sum. 


Definition 1.3.4 A factor M is a von Neumann algebra with centre Z(M) = C1. 


Thus the study of von Neumann algebras is reduced to that of factors, the simple 
building blocks of any von Neumann algebra. L(H) is a factor. In finite dimensions, 
any (concrete) factor is of the form M_n(C) ⊗ C I_m, acting in the Hilbert space Cⁿ ⊗ C^m 
(I_m is the m × m identity matrix). Whenever the factor is (abstractly) isomorphic to some 
L(H), its concrete realisation will have a similar tensor product structure, which is the 
source of the name ‘factor’. In quantum field theory, where von Neumann algebras arise 
as algebras of operators (Section 4.2.4), a factor means there is no observable that can 
be measured simultaneously (with infinite precision) with all others. 

The richness of the theory is because there are other factors besides L(H). In particular, 
factors fall into different families: 


Type I_n: the factors (abstractly) isomorphic to L(H) (n = dim H). 

Type II₁: infinite-dimensional but it has a trace (i.e. a linear functional tr : M → C such 
that tr(xy) = tr(yx)). 

Type II_∞: the factors isomorphic to II₁ ⊗ L(H). 

Type III: everything else. 


Choosing the normalisation tr(1) = 1, the type II₁ trace will be unique. This is a very 
coarse-grained breakdown, and in fact the complete classification of factors is not known. 
There are uncountably many inequivalent type II₁ factors. Type III is further subdivided 
into families III_λ for all 0 ≤ λ ≤ 1. von Neumann regarded the type III factors 
as pathological, but this was unfair (see Section 6.2.6). Almost every factor is of 
type III₁ (i.e. perturbing an infinite-dimensional factor typically gives you one 
of type III₁). Hyperfinite factors are limits in some sense of finite-dimensional factors. 
There is a unique (abstract) hyperfinite factor of type II₁, II_∞ and III_λ for 0 < λ ≤ 1; 
we are interested in the hyperfinite II₁ and III₁ factors. Incidentally, the von Neumann 
algebras arising in quantum field theory are always of type III₁. 

Discrete groups impinge on the theory through the crossed-product construction of 
factors. Start with any von Neumann algebra M ⊂ L(H), and let G be a discrete group 
acting on M (so g.(xy) = (g.x)(g.y) and g.x* = (g.x)*). Let H_G = H ⊗ ℓ²(G) be the 
Hilbert space consisting of all column vectors ζ = (ζ_g)_{g∈G} with entries ζ_g ∈ H and 
obeying Σ_{g∈G} ‖ζ_g‖² < ∞. M acts on H_G by ζ ↦ π(x)(ζ), where π is defined by 

\[ (\pi(x)(\zeta))_g = (g^{-1}.x)\,\zeta_g. \tag{1.3.4a} \]

In (1.3.4a), g⁻¹.x is the action of G on M, and g⁻¹.x ∈ M ⊂ L(H) acts on ζ_g ∈ H by 
definition. Likewise, G acts on H_G by ζ ↦ λ(h)(ζ), where λ is defined by 

\[ (\lambda(h)(\zeta))_g = \zeta_{h^{-1}g}. \tag{1.3.4b} \]

We can regard π and λ as embedding M and G in L(H_G). The crossed-product is simply 
the smallest von Neumann algebra containing both these images: 

\[ M \rtimes G := (\pi(M) \cup \lambda(G))''. \tag{1.3.4c} \]


More explicitly (using the obvious orthonormal basis), any bounded operator ỹ ∈ L(H_G) 
is a matrix ỹ = (ỹ_{g,h}) with entries ỹ_{g,h} ∈ L(H) for g, h ∈ G, and where (ỹζ)_g = 
Σ_{h∈G} ỹ_{g,h} ζ_h (defining the infinite sum on the right appropriately [319]). Then for all 
x ∈ M and g, h, k ∈ G, we get the matrix entries 

\[ \pi(x)_{g,h} = \delta_{g,h}\, h^{-1}.x, \qquad \lambda(k)_{g,h} = \delta_{g,kh}\, 1. \]

The crossed-product is now a space of functions y : G → M: 

\[ M \rtimes G = \{\, y : G \to M \mid \exists\, \tilde{y} \in L(H_G) \text{ such that } \tilde{y}_{g,h} = h^{-1}.(y(gh^{-1}))\ \ \forall g, h \in G \,\} \tag{1.3.4d} \]

(see lemma 1.3.1 of [319]). In this notation the algebra structure of M ⋊ G is given by 

\[ (y\,y')(g) = \sum_{h \in G} h^{-1}.(y(gh^{-1}))\; y'(h), \tag{1.3.4e} \]
\[ (y^*)(g) = g^{-1}.(y(g^{-1})^*). \tag{1.3.4f} \]


Crossed-products allow for elegant constructions of factors. For example, the (von 
Neumann) group algebra C ⋊ G is type II₁, for any discrete group G acting trivially on 
C and with the property that all of its conjugacy classes (apart from {e}) are infinite 
(examples of such G are the free groups F_n or PSL₂(Z)). Also, any type III₁ factor is of 
the form M ⋊ R, where M is type II_∞ and the R action scales the trace. 

A proper treatment of factors (which this subsection is not) would involve projections 
onto closed subspaces, that is elements p ∈ M satisfying p = p* = p². These span (in 
the appropriate sense) the full von Neumann algebra. In the case of M = M_n(C), the 
projections are precisely the orthogonal projections onto subspaces of Cⁿ, and thus have a 
well-defined dimension (namely the dimension of that subspace, so some integer between 
0 and n). Remarkably, the same applies to any projection in any factor. For type II₁, this 
‘dimension’ dim(p) is the trace tr(p), which we can normalise so that tr(1) = 1. Then 
we get that dim(p) continuously fills out the interval [0, 1]. For type II_∞, the dimensions 
fill out [0, ∞]. For type III, every nonzero projection is equivalent (in a certain sense) to 
the identity and so the (normalised) dimensions are either 0 or 1. 
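
For the type I case M = M₃(C) these statements can be checked by hand. The sketch below, in Python, is only an illustration under stated assumptions: the unit vector v and all helper names are hypothetical choices, and p = v v* is the orthogonal projection onto the line spanned by v. It confirms p = p* = p² and that the trace counts the dimension of the image.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def close(A, B, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Orthogonal projection of C^3 onto the line spanned by the unit vector v: p = v v*.
v = [1 / 3, 2j / 3, 2 / 3]
p = [[v[i] * complex(v[j]).conjugate() for j in range(3)] for i in range(3)]
# Its complement 1 - p projects onto the orthogonal plane.
q = [[(1 if i == j else 0) - p[i][j] for j in range(3)] for i in range(3)]

print(close(matmul(p, p), p), close(adjoint(p), p))
# Traces normalised so that the identity has trace 1.
print(trace(p).real / 3, trace(q).real / 3)
```

With the normalisation tr(1) = 1 the only values available in M₃(C) are 0, 1/3, 2/3 and 1; the remark above is that a type II₁ factor fills in every value in between.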
Finally, one can ask for the relation between the abstract and concrete definitions of 
M; in other words, given a factor M, what are the different representations (= modules) 
of M, that is realisations of M as bounded operators on a Hilbert space H. For example, 
for M of type I_n, these are of the form M ⊗ C^m = M ⊕ ··· ⊕ M (m times) for m finite, 
as well as M ⊗ ℓ². We see the type I_n modules are in one-to-one correspondence with 
the ‘multiplicity’ m ∈ {0, 1, ..., ∞}, which we can denote dim_M(H) and think of as 
dim(H)/dim(M), at least when M is finite-dimensional. There is a similar result for 
type II₁: for each choice d ∈ [0, ∞] there is a unique module H_d, and any module H is 
equivalent to a unique H_d. Finally, any two nontrivial representations of a type III factor 
will be equivalent. For a general definition of dim_M(H) and a proof of this representation 
theory, see theorem 2.1.6 in [319]. 

For type II₁, this parameter d =: dim_M(H) is sometimes called by von Neumann’s 
unenlightening name ‘coupling constant’. Incidentally, H₁ is constructed in 
Question 1.3.6. 


Question 1.3.1. (a) Verify explicitly that the position f(x) ↦ x f(x) and momentum i d/dx 
operators are neither bounded nor continuous, for the Hilbert space L²(R). 
(b) Verify explicitly that the position operator of (a) is not defined everywhere. 


Question 1.3.2. Consider the shift operator S(x₁, x₂, ...) = (0, x₁, x₂, ...) in ℓ²(∞). 
Verify that S*S = 1 but SS* ≠ 1. 


Question 1.3.3. Apply the Spectral Theorem to the momentum operator i d/dx. 


Question 1.3.4. Let V = {f ∈ C^∞(S¹) | f(0) = 0}. 

(a) Verify that V is dense in H = L²(S¹). 

(b) Verify that D = i d/dθ obeys (Df, g) = (f, Dg) for all f, g ∈ V. 

(c) Construct the adjoint D* of D : V → H. Is D self-adjoint? 

(d) For each λ ∈ C, define V_λ to be the extension of V consisting of all functions smooth 
on the interval [0, 2π] and with f(0) = λ f(2π). Extend D in the obvious way to V_λ. 
For which λ is D now self-adjoint? 


Question 1.3.5. Let the free group F₂ act trivially on C. Find a trace for C ⋊ F₂. What is 
the centre of C ⋊ F₂? 


Question 1.3.6. Let M be type II₁. Prove M is a pre-Hilbert space by defining (x, y) 
appropriately (Hint: use the trace). Let L²(M) be its completion. Show that L²(M) is a 
module over M. 


1.4 Lie groups and Lie algebras 


Undergraduates are often disturbed (indeed, reluctant) to learn that the vector-product 
u × v really only works in three dimensions. Of course, there are several generalisations 
to other dimensions: for example an antisymmetric (N − 1)-ary product (a determinant) 
in N dimensions, or the wedge product of k-forms in 2k + 1 dimensions. Arguably the 
most fruitful generalisation is that of a Lie algebra, defined below. Lie algebras are the 
tangent spaces of those differentiable manifolds whose points can be ‘multiplied’ together. 
As we know, much of algebra is developed by analogy with elementary properties 
of integers. For a finite-dimensional Lie algebra, a divisor is called an ideal; a prime 
is called simple; and multiplying corresponds to semi-direct sum (Lie algebras behave 
more simply than groups but not as simply as numbers). In particular, simple Lie algebras 
are important for reasons similar to those that make simple groups important, and can 
also be classified (with much less effort). One non-obvious discovery is that they are rigid: 
the best way to capture the structure of a simple Lie algebra is through a graph. We push 
this thought further in Section 3.3. For an elementary introduction to Lie theory, [92] is 
highly recommended. 


1.4.1 Definition and examples of Lie algebras 


An algebra is a vector space with a way to multiply vectors that is compatible with the 
vector space structure (i.e. the vector-valued product is required to be bilinear: (au + 
a′u′) × (bv + b′v′) = ab u × v + ab′ u × v′ + a′b u′ × v + a′b′ u′ × v′). For example, 
the complex numbers C form a two-dimensional algebra over R (a basis is 1 and i = 
√−1; the scalars here are real numbers and the vectors are complex numbers). The 
quaternions are four-dimensional over R and the octonions are eight-dimensional over R. 
Incidentally, these (together with R itself) are the only finite-dimensional normed algebras 
over R that obey the cancellation law: u ≠ 0 and u × v = 0 implies v = 0 (does the 
vector-product of R³ fail the cancellation law?). This important little fact makes several 
unexpected appearances [29]. For instance, imagine a ball (i.e. S²) covered in hair. No 
matter how you comb it, there will be a part in the hair, or at least a point where the hair 
leaves in all directions, or some such problem. More precisely, there is no continuous 
nowhere-zero vector field on S². On the other hand, it is trivial to comb the hair on the 
circle S¹ without singularity: just comb it clockwise, for example. More generally, the even 
spheres S^{2k} can never be combed. Now try something more difficult: place k wigs on S^k, 
and try to comb all k of them so that at each point on S^k the k hairs are linearly 
independent. This is equivalent to saying that the tangent bundle TS^k equals S^k × R^k. 
The only k-spheres S^k that can be ‘k-combed’ in this way (i.e. for which there exist k 
linearly independent continuous vector fields) are for k = 1, 3 and 7. This is intimately 
connected with the existence of C, the quaternions and octonions (namely, S¹, S³ and S⁷ 
are the length 1 complex numbers, quaternions and octonions, respectively) [104]. 


Definition 1.4.1 A Lie algebra g is an algebra with product (usually called a ‘bracket’ 
and written [xy]) that is both ‘anti-commutative’ and ‘anti-associative’ : 


\[ [xy] + [yx] = 0; \tag{1.4.1a} \]
\[ [x[yz]] + [y[zx]] + [z[xy]] = 0. \tag{1.4.1b} \]


Like most other identities in mathematics, (1.4.1b) is named after Jacobi (although he 
died years before Lie theory was created). Usually we consider Lie algebras over C, but 
sometimes over R. Note that (1.4.1a) is equivalent to demanding [xx] = 0 (except for 
fields of characteristic 2). 
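
As a quick sanity check of Definition 1.4.1, the vector-product on R³ that opened this section can be tested against both axioms. The following Python sketch (the helper names are hypothetical) checks (1.4.1a) and (1.4.1b) on the standard basis only, which suffices because every term is linear in each argument.

```python
from itertools import product

def cross(u, v):
    """Vector product on R^3, playing the role of the bracket [uv]."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(*vectors):
    """Componentwise sum of any number of vectors."""
    return tuple(sum(coords) for coords in zip(*vectors))

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
zero = (0, 0, 0)

# (1.4.1a): [xy] + [yx] = 0 on all pairs of basis vectors.
anti_commutative = all(add(cross(x, y), cross(y, x)) == zero
                       for x, y in product(basis, repeat=2))

# (1.4.1b): [x[yz]] + [y[zx]] + [z[xy]] = 0 on all triples of basis vectors.
jacobi = all(
    add(cross(x, cross(y, z)), cross(y, cross(z, x)), cross(z, cross(x, y))) == zero
    for x, y, z in product(basis, repeat=3)
)
print(anti_commutative, jacobi)
```

Integer arithmetic keeps the check exact, so no tolerance is needed.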
A homomorphism φ : g₁ → g₂ between Lie algebras must preserve the linear structure 
as well as the bracket, i.e. φ is linear and φ[xy] = [φ(x)φ(y)] for all x, y ∈ g₁. If φ is 
in addition invertible, we call g₁, g₂ isomorphic. 

One important consequence of bilinearity (together with anti-commutativity) is that it is 
enough to know the values of all the brackets [x^(i) x^(j)] for i < j, for any basis 
{x^(1), x^(2), ...} of the vector space g. (The reader should convince himself of this 
before proceeding.) 

A trivial example of a Lie algebra is a vector space g with a bracket identically 0: [xy] = 
0 for all x, y € g. Any such Lie algebra is called abelian, because in any representation 
(i.e. realisation by matrices) its matrices will commute. Abelian Lie algebras of equal 
dimension are isomorphic. 

In fact, the only one-dimensional Lie algebra (for any choice of field F) is the abelian 
one g = F. It is straightforward to find all two- and three-dimensional Lie algebras (over 
C) up to isomorphism: there are precisely two and six of them, respectively (though one of 
the six depends on a complex parameter). Over R, there are two and nine (with two of the 
latter depending on real parameters). This exercise cannot be continued much further — 
for example, not all seven-dimensional Lie algebras (over C say) are known. Nor is it 
obvious that this would be a valuable exercise. We should suspect that our definition of 
Lie algebra is probably too general for anything obeying it to be automatically interesting. 
Most commonly, a classification yields a stale and useless list — a phone book more than 
a tourist guide. 

Two of the three-dimensional Lie algebras are important in what follows. One of 
them is well known to the reader: the vector-product in C³. Taking the standard basis 
{e₁, e₂, e₃} of C³, the bracket can be defined by the relations 

\[ [e_1 e_2] = e_3, \quad [e_1 e_3] = -e_2, \quad [e_2 e_3] = e_1. \tag{1.4.2a} \]

This algebra, denoted A₁ or sl₂(C), deserves the name ‘mother of all Lie algebras’ 
(Section 1.4.3). Its more familiar realisation uses a basis {e, f, h} with relations 

\[ [ef] = h, \quad [he] = 2e, \quad [hf] = -2f. \tag{1.4.2b} \]

The reader can find the change-of-basis (valid over C but not R) showing that equations 
(1.4.2) define isomorphic complex (though not real) Lie algebras. 

Another important three-dimensional Lie algebra is the Heisenberg algebra⁹ Heis, the 
algebra of the canonical commutation relations in quantum mechanics, defined by 

\[ [xp] = h, \quad [xh] = [ph] = 0. \tag{1.4.3} \]

The most basic source of Lie algebras is the n × n matrices with commutator: 

\[ [AB] = [A, B] := AB - BA \tag{1.4.4} \]


(the reader can verify that the commutator always obeys (1.4.1)). Let gl_n(R) (respectively 
gl_n(C)) denote the Lie algebra of all n × n matrices with coefficients in R (respectively 
C), with Lie bracket given by (1.4.4). More generally, if A is any associative algebra, 
then A becomes a Lie algebra by defining the bracket [xy] = xy − yx. 

⁹ There actually is a family of ‘Heisenberg algebras’, with (1.4.3) being the one of least dimension. 
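
The commutator construction gives a concrete model of the relations (1.4.2b). The sketch below, in Python, uses the standard 2 × 2 matrices usually chosen for e, f, h; that particular choice of matrices is not given in the text and is an assumption of this illustration, as are the helper names.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def bracket(A, B):
    """Commutator [AB] = AB - BA, as in (1.4.4)."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    """Scalar multiple c*A."""
    return [[c * a for a in row] for row in A]

# Trace-zero matrices commonly used as a basis of sl_2 (an assumed, standard choice).
e = [[0, 1], [0, 0]]
f = [[0, 0], [1, 0]]
h = [[1, 0], [0, -1]]

print(bracket(e, f) == h)
print(bracket(h, e) == scale(2, e))
print(bracket(h, f) == scale(-2, f))
```

All three matrices have trace zero, and so does every commutator, so their span is closed under the bracket.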

Another general construction of Lie algebras starts with any (not necessarily associative 
or commutative) algebra A. By a derivation of A, we mean any linear map 
δ : A → A obeying the Leibniz rule δ(ab) = δ(a) b + a δ(b). We can compose derivations, 
but in general the result δ₁ ∘ δ₂ won’t be a derivation. However, an easy calculation 
verifies that the commutator [δ₁δ₂] = δ₁ ∘ δ₂ − δ₂ ∘ δ₁ of derivations is also a derivation. 
Hence the vector space of derivations is naturally a Lie algebra. If A is finite-dimensional, 
so will be its Lie algebra of derivations. 
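
The easy calculation can be spelled out as follows; it is a sketch using nothing beyond linearity and the Leibniz rule for each of the two derivations, applied to a product ab.

\[
\begin{aligned}
(\delta_1\circ\delta_2)(ab)
  &= \delta_1\bigl(\delta_2(a)\,b + a\,\delta_2(b)\bigr) \\
  &= (\delta_1\delta_2)(a)\,b + \delta_2(a)\,\delta_1(b)
     + \delta_1(a)\,\delta_2(b) + a\,(\delta_1\delta_2)(b), \\
(\delta_1\circ\delta_2 - \delta_2\circ\delta_1)(ab)
  &= (\delta_1\delta_2 - \delta_2\delta_1)(a)\,b
     + a\,(\delta_1\delta_2 - \delta_2\delta_1)(b).
\end{aligned}
\]

The cross terms δ₂(a) δ₁(b) + δ₁(a) δ₂(b) are symmetric under exchanging the labels 1 and 2. They are what stops the composition from obeying the Leibniz rule, and they cancel in the commutator.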

In particular, vector fields X ∈ Vect(M) are derivations. We can compose them X ∘ Y, 
but this results in a second-order differential operator. Instead, the natural ‘product’ is 
their commutator [X, Y] = X ∘ Y − Y ∘ X, as it always results in a vector field. Vect(M) 
with this bracket is an infinite-dimensional Lie algebra. For example, recall Vect(S¹) from 
Section 1.2.2 and compare 

\[ \Bigl(f(\theta)\frac{d}{d\theta}\Bigr) \circ \Bigl(g(\theta)\frac{d}{d\theta}\Bigr) = f(\theta)g(\theta)\frac{d^2}{d\theta^2} + f(\theta)g'(\theta)\frac{d}{d\theta}, \]

\[ \Bigl[f(\theta)\frac{d}{d\theta},\ g(\theta)\frac{d}{d\theta}\Bigr] = \bigl(f(\theta)g'(\theta) - f'(\theta)g(\theta)\bigr)\frac{d}{d\theta}. \]

Incidentally, another natural way to multiply vector fields X, Y, the Lie 
derivative L_X(Y) defined in Section 1.2.2, equals the commutator [X, Y] and so gives 
the same Lie algebra structure on Vect(M). 


1.4.2 Their motivation: Lie groups 


From Definition 1.4.1 it is far from clear that Lie algebras, as a class, should be natural 
and worth studying. After all, there are infinitely many possible axiomatic systems: 
why should this one be anything special a priori? Perhaps the answer could have been 
anticipated by the following line of reasoning. 


Axiom Groups are important and interesting. 
Axiom Manifolds are important and interesting. 


Definition 1.4.2 A Lie group G is a manifold with a compatible group structure. 


This means that ‘multiplication’ μ : G × G → G (which sends the pair (a, b) to ab) and 
‘inverse’ ι : G → G (which sends a to a⁻¹) are both differentiable maps. The manifold 
structure (Definition 1.2.3) of G can be chosen as follows: fix any open set U_e about the 
identity e ∈ G; then the open set U_g := gU_e will contain g ∈ G. The real line R is a 
Lie group under addition: obviously, μ and ι defined by μ(a, b) = a + b and ι(a) = −a 
are both differentiable. A circle is also a Lie group: parametrise the points with the angle 
θ defined mod 2π; the ‘product’ of the point at angle θ₁ with the point at angle θ₂ is 
the point at angle θ₁ + θ₂. Surprisingly, the only other k-sphere that is a Lie group is S³ 
(the product can be defined using quaternions of unit length,¹⁰ or by identifying S³ with 
the matrix group SU₂(C)). This is because it is always possible to ‘n-comb the hair’ on 
an n-dimensional Lie group (Section 1.4.1); more precisely, the tangent bundle TG of 
any Lie group is trivial, G × Rⁿ, something easy to see using the charts U_g. 
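
The group structure on S³ can be made explicit with quaternion arithmetic. The following Python sketch is an illustration under stated assumptions: quaternions are stored as 4-tuples (a, b, c, d) standing for a + bi + cj + dk, the sample points are arbitrary, and the helper names are hypothetical. It checks that unit quaternions are closed under the product, that the inverse of a unit quaternion is its conjugate, and that the product is associative but not commutative.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions written as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def norm(q):
    return math.sqrt(sum(x * x for x in q))

def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def unit(q):
    """Rescale a nonzero quaternion to length 1, i.e. to a point of S^3."""
    n = norm(q)
    return tuple(x / n for x in q)

def close(p, q, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(p, q))

p = unit((1.0, 2.0, -1.0, 0.5))
q = unit((0.0, 1.0, 3.0, -2.0))
r = unit((2.0, 0.0, 1.0, 1.0))

print(abs(norm(qmul(p, q)) - 1) < 1e-12)               # closure on the unit sphere
print(close(qmul(p, conj(p)), (1, 0, 0, 0)))           # inverse is the conjugate
print(close(qmul(qmul(p, q), r), qmul(p, qmul(q, r)))) # associativity
print(qmul((0, 1, 0, 0), (0, 0, 1, 0)), qmul((0, 0, 1, 0), (0, 1, 0, 0)))  # ij versus ji
```

Both μ and ι are polynomial in these coordinates, which is the differentiability that Definition 1.4.2 requires.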

A complex Lie group G is a complex manifold with a compatible group structure. 
For example, the only one-dimensional compact real Lie group is S¹, whereas there are 
infinitely many compact one-dimensional complex Lie groups, namely the tori or ‘elliptic 
curves’ C/L, for any two-dimensional lattice L in the plane C. Thought of as real Lie 
groups (i.e. forgetting their complex structure), elliptic curves are all real-diffeomorphic 
to S¹ × S¹; they differ in their complex-differential structure. We largely ignore the 
complex Lie groups; unless otherwise stated, by ‘Lie group’ we mean ‘real Lie group’.¹¹ 

Many but not all Lie groups can be expressed as matrix groups whose operation is 
matrix multiplication. The most important are GL_n (invertible n × n matrices) and SL_n 
(ones with determinant 1). 

Incidentally, Hilbert’s 5th problem¹² asked how important the differentiability hypothesis 
is here. It turns out it isn’t (see [569] for a review): if a group G is a topological 
manifold, and μ and ι are merely continuous, then it is possible to endow G with a 
differentiable structure in one and only one way so that μ and ι are differentiable. 

In any case, a consequence of the above axioms is surely: 


Corollary Lie groups should be important and interesting. 


Indeed, Lie groups appear throughout mathematics and physics, as we will see again 
and again. For example, the Lie groups of relativistic physics (Section 4.1.2) come 
from the group O_{3,1}(R) consisting of all 4 × 4 matrices A obeying A G Aᵗ = G, where 
G = diag(1, 1, 1, −1) is the Minkowski metric. Any such A must have determinant ±1, 
and has |A₄₄| ≥ 1; these 2 × 2 possibilities define the four connected components of 
O_{3,1}(R). The (restricted) Lorentz group SO⁺_{3,1}(R) consists of the determinant 1 matrices 
A in O_{3,1}(R) with A₄₄ ≥ 1. It describes rotations in 3-space, as well as ‘boosts’ (changes 
of velocity). SO⁺_{3,1}(R) has a double-cover (i.e. an extension by Z₂) isomorphic to SL₂(C), 
which is more fundamental. Finally, the Poincaré group is the semi-direct product of 
SO⁺_{3,1}(R) with R⁴, corresponding to adjoining to SO⁺_{3,1}(R) the translations in space-time 
R⁴. The Lorentz group is six-dimensional, while the Poincaré group is 10-dimensional. 
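
The defining condition A G Aᵗ = G is easy to test numerically. The Python sketch below builds a boost and a spatial rotation and checks that their product preserves the Minkowski metric and has A₄₄ ≥ 1. The parametrisation by a rapidity and an angle, the sample values 1.3 and 0.7, and the helper names are all assumptions made for this illustration.

```python
import math

G = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]  # Minkowski metric

def matmul(A, B):
    """Product of two 4 x 4 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def boost_x(rapidity):
    """Boost mixing the first space coordinate with time (the fourth coordinate)."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, 0, 0, s], [0, 1, 0, 0], [0, 0, 1, 0], [s, 0, 0, c]]

def rotation_z(angle):
    """Rotation of 3-space about the third axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def preserves_metric(A, tol=1e-9):
    """Check the defining relation A G A^t = G up to rounding error."""
    AGAt = matmul(matmul(A, G), transpose(A))
    return all(abs(AGAt[i][j] - G[i][j]) < tol for i in range(4) for j in range(4))

A = matmul(rotation_z(0.7), boost_x(1.3))
print(preserves_metric(A), A[3][3] >= 1)
```

Here the entry A₄₄ equals cosh(1.3), and cosh never drops below 1; this is the inequality separating the restricted Lorentz group from the other components.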

As said in Section 1.2.2, the tangent spaces of manifolds are vector spaces of dimension 
equal to that of the manifold. The space structure is easy to see for Lie groups: choose 
any infinitesimal curves u = (g(t))_e, v = (h(t))_e ∈ T_eG, so g(0) = h(0) = e, and let 
a, b ∈ R. Then au + bv corresponds to the curve t ↦ g(at) h(bt). 

Not surprisingly, G acts on the tangent vectors: let u ∈ T_hG correspond to curve h(t), 
with h(0) = h, and define gu for any g ∈ G to be the vector in T_{gh}G corresponding 
to the curve t ↦ g(h(t)). This means that conjugating gug⁻¹ for any element u ∈ T_eG 
gives another element of T_eG, that is T_eG carries a representation of the group G called 
the adjoint representation. 


¹⁰ Similarly, the 7-sphere inherits from the octonions a non-associative (hence nongroup) product, 
compatible with its manifold structure. 

¹¹ Our vector spaces (e.g. Lie algebras) are usually complex; our manifolds (e.g. Lie groups) are usually real. 

¹² At the International Congress of Mathematicians in 1900, David Hilbert announced several problems 
chosen to anticipate (and direct) major areas of study. His list was deeply influential. 

This is all fine. However, we have a rich structure on our manifold, namely the group 
structure, and it would be deathly disappointing if this adjoint representation were the 
high-point of the theory. Fortunately we can go much further. Consider any u, v ∈ T_eG, 
where v = (g(t))_e. Then g(t) u g(t)⁻¹ lies in the vector space T_eG for all t, and hence 
so will the derivative. It turns out that the quantity 

\[ [vu] := \left.\frac{d}{dt}\bigl(g(t)\,u\,g(t)^{-1}\bigr)\right|_{t=0} \tag{1.4.5} \]

depends only on u and v (hence the notation). A little work shows that it is bilinear, 
anti-symmetric, and anti-associative. That is, T_eG is a Lie algebra! 

In the last subsection we indicated that Vect(M) carries a Lie algebra structure, for any 
manifold M. It is tempting to ask: when M is a Lie group G, what is the relation between 
the infinite-dimensional Lie algebra Vect(G), and the finite-dimensional Lie algebra 
T_eG? Note that G acts on the space Vect(G) by ‘left-translation’, that is if X is a vector 
field, which we can think of as a derivation of the algebra C^∞(G) of real-valued functions 
on G, and g ∈ G, then g.X is the vector field given by (g.X)(f)(h) = X(f)(gh). Then 
the Lie algebra T_eG is isomorphic to the subalgebra of Vect(G) consisting of the 
‘left-invariant vector fields’, that is those X obeying g.X = X. Given any manifold M, the 
Lie algebra Vect(M) corresponds to the infinite-dimensional Lie group Diff⁺(M) of 
orientation-preserving diffeomorphisms of M; when M is itself a Lie group, the 
left-invariant vector fields correspond in Diff⁺(M) to a copy of M given by left-multiplication. 


Fact The tangent space of a Lie group is a Lie algebra. Conversely, any (finite-dimensional 
real or complex) Lie algebra is the tangent space T_eG to some Lie group. 


For example, consider the Lie group $G = SL_n(\mathbb{R})$. Let $A(t) = (A_{ij}(t))$ be any curve in
G with $A(0) = I_n$. We see that only one term in the expansion of det A(t) can contribute
to its derivative at t = 0, namely the diagonal term $A_{11}(t) \cdots A_{nn}(t)$, so differentiating
det(A(t)) = 1 at t = 0 tells us that $A'_{11}(0) + \cdots + A'_{nn}(0) = 0$. Thus the tangent space
$T_{I_n}G$ consists of all trace-zero n × n matrices, since the algebra like the group must be
$(n^2 - 1)$-dimensional. We write it $\mathfrak{sl}_n(\mathbb{R})$. Now choose any matrices $U, V \in \mathfrak{sl}_n(\mathbb{R})$, and
let A(t) be the curve in $SL_n(\mathbb{R})$ corresponding to V, so that $A'(0) = V$. Differentiating
$A(t)A(t)^{-1} = I_n$, we see that $(A^{-1})' = -A^{-1}A'A^{-1}$ and thus (1.4.5) becomes

$$[VU] = A'(0)\,U\,I_n^{-1} + I_n\,U\,(-I_n^{-1}A'(0)\,I_n^{-1}) = VU - UV.$$

In other words, the bracket in $\mathfrak{sl}_n(\mathbb{R})$, as with any other matrix algebra, is given by the
commutator (1.4.4).
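The computation above can be checked numerically. The following is only a minimal sketch, assuming the curve through the identity with tangent V is taken to be g(t) = exp(tV) (one convenient choice; any curve with the same tangent would do). The helper names `matmul`, `expm` and `conjugate` are illustrative, not from the text.

```python
# Sketch: compare d/dt [g(t) U g(t)^{-1}] at t = 0 with the commutator VU - UV.
# Assumption: g(t) = exp(tV) is used as the curve with tangent vector V.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=25):
    """Truncated Taylor series for the matrix exponential e^A."""
    n = len(A)
    total = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[s + t for s, t in zip(rs, rt)] for rs, rt in zip(total, term)]
    return total

def conjugate(V, U, t):
    """Return g(t) U g(t)^{-1} for g(t) = exp(tV)."""
    g = expm([[t * x for x in row] for row in V])
    g_inv = expm([[-t * x for x in row] for row in V])
    return matmul(matmul(g, U), g_inv)

V = [[0.0, 1.0], [0.0, 0.0]]   # tangent vector of the curve
U = [[0.0, 0.0], [1.0, 0.0]]   # element being conjugated

eps = 1e-5
plus, minus = conjugate(V, U, eps), conjugate(V, U, -eps)
# central difference approximation of the derivative at t = 0
derivative = [[(p - m) / (2 * eps) for p, m in zip(rp, rm)]
              for rp, rm in zip(plus, minus)]

VU, UV = matmul(V, U), matmul(U, V)
commutator = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(VU, UV)]
error = max(abs(derivative[i][j] - commutator[i][j])
            for i in range(2) for j in range(2))
```

Both U and V here are trace-zero, and so is their commutator, as expected for elements of the Lie algebra of trace-zero matrices.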
Given the above fact, a safe guess would be: 


Conjecture Lie algebras are important and interesting. 


From this line of reasoning, it should be expected that historically Lie groups arose first. 
Indeed that is the case: Sophus Lie introduced them in 1873 to try to develop a Galois
theory for ordinary differential equations. Galois theory can be used for instance to show
that not all fifth degree (or higher) polynomials can be explicitly ‘solved’ using radicals 
(Section 1.7.2). Lie wanted to study the explicit solvability (integrability) of differential 
equations, and this led him to develop what we now call Lie theory. The importance of 
Lie groups, however, has grown well beyond this initial motivation. 

A Lie algebra, being a linearised Lie group, is much simpler and easier to handle. The 
algebra preserves the local properties of the group, though it loses global topological 
properties (like compactness). A Lie group has a single Lie algebra, but a Lie algebra 
corresponds to many different Lie groups. The Lie algebra corresponding to both R and 
$S^1$ is g = R with trivial bracket. The Lie algebra corresponding to both $S^3 \cong SU_2(\mathbb{C})$
and $SO_3(\mathbb{R})$ is the vector-product algebra (1.4.2a) (usually called $\mathfrak{so}_3(\mathbb{R})$).

We saw earlier that many (but not all) examples of Lie groups are matrix groups, that
is subgroups of $GL_n(\mathbb{R})$ or $GL_n(\mathbb{C})$. The Ado-Iwasawa Theorem (see e.g. chapter VI of
[314]) says that all finite-dimensional Lie algebras (over any field) are realisable as Lie
subalgebras of $\mathfrak{gl}_n(\mathbb{R})$ or $\mathfrak{gl}_n(\mathbb{C})$. This is analogous to Cayley’s Theorem, which says any
finite group is a subgroup of some symmetric group $S_n$. Now, choose any Lie algebra
$g \subset \mathfrak{gl}_n(\mathbb{C})$. Let G be the topological closure of the subgroup of $GL_n(\mathbb{C})$ generated by
all matrices $e^A$ for $A \in g$, where $e^A$ is defined by the Taylor expansion

$$e^A = \sum_{k=0}^{\infty} \frac{1}{k!} A^k.$$

Then the Lie group G has Lie algebra g. Remarkably, the group operation on G (at 
least close to the identity) can be deduced from the bracket: the first few terms of the 
Baker-Campbell-Hausdorff formula read

$$\exp(X)\exp(Y) = \exp\left(X + Y + \frac{1}{2}[XY] + \frac{1}{12}[[XY]Y] + \frac{1}{12}[[YX]X] + \cdots\right). \qquad (1.4.6)$$

See, for example, [475] for the complete formula and some of its applications. 
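As a sanity check on the displayed terms, here is a small sketch using exact rational arithmetic. It assumes we restrict to strictly upper-triangular 4 × 4 matrices, where any product of four matrices vanishes, so every bracket of depth four or more is zero and the terms through third order should reproduce exp(X) exp(Y) exactly. The matrices X, Y and the helper names are illustrative choices.

```python
from fractions import Fraction

N = 4  # strictly upper-triangular 4x4 matrices: any product of four of them vanishes

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def combine(*pairs):
    """Linear combination sum(c * M) over (c, M) pairs, with exact fractions."""
    return [[sum(Fraction(c) * M[i][j] for c, M in pairs) for j in range(N)]
            for i in range(N)]

def bracket(A, B):
    return combine((1, matmul(A, B)), (-1, matmul(B, A)))

def exp_nilpotent(A):
    """Exact exponential I + A + A^2/2 + A^3/6, valid because A^4 = 0."""
    identity = [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]
    A2 = matmul(A, A)
    A3 = matmul(A2, A)
    return combine((1, identity), (1, A), (Fraction(1, 2), A2), (Fraction(1, 6), A3))

X = [[0, 1, 2, 3], [0, 0, 4, 5], [0, 0, 0, 6], [0, 0, 0, 0]]
Y = [[0, 7, 1, 2], [0, 0, 3, 1], [0, 0, 0, 2], [0, 0, 0, 0]]

# X + Y + 1/2 [XY] + 1/12 [[XY]Y] + 1/12 [[YX]X]
Z = combine(
    (1, X), (1, Y),
    (Fraction(1, 2), bracket(X, Y)),
    (Fraction(1, 12), bracket(bracket(X, Y), Y)),
    (Fraction(1, 12), bracket(bracket(Y, X), X)),
)

lhs = matmul(exp_nilpotent(X), exp_nilpotent(Y))
rhs = exp_nilpotent(Z)
naive = exp_nilpotent(combine((1, X), (1, Y)))  # what we would get if X and Y commuted
```

Comparing `lhs` with `naive` shows why the bracket terms are needed: X and Y do not commute, so exp(X + Y) alone does not reproduce the group product.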

We saw earlier that the condition ‘determinant = 1’ for matrix groups translates to 
the Lie algebra condition ‘trace = 0’. This also follows from the identity $\det(e^A) = e^{\mathrm{tr}\,A}$,
which follows quickly from the Jordan canonical form of A.

Of course all undergraduates are familiar, at least implicitly, with exponentiating 
operators. Taylor’s Theorem tells us that for any analytic function f and any real number 
a, the operator $e^{a\,d/dx}$ sends f(x) to f(x + a). Curiously, the operator $\log(d/dx)$ also has a
meaning, in the context of, for example, affine Kac-Moody algebras [344].
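For polynomials the exponential series terminates, so the shift property can be checked exactly. The sketch below is illustrative only: it represents a polynomial by its list of coefficients (an assumed encoding), and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    """Derivative of a polynomial; coeffs[k] is the coefficient of x^k."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def exp_shift(coeffs, a):
    """Apply exp(a d/dx) = sum_k (a^k / k!) (d/dx)^k to a polynomial."""
    result = [Fraction(0)] * len(coeffs)
    term = [Fraction(c) for c in coeffs]
    k = 0
    while term:                      # the series stops once the derivative is zero
        weight = Fraction(a) ** k / factorial(k)
        for i, c in enumerate(term):
            result[i] += weight * c
        term = derivative(term)
        k += 1
    return result

def evaluate(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

f = [1, -3, 0, 2]           # f(x) = 1 - 3x + 2x^3
a = 2
shifted = exp_shift(f, a)   # coefficients of f(x + 2)
```

Here `shifted` holds the coefficients of 2x^3 + 12x^2 + 21x + 11, which is f(x + 2) expanded.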

The definition of a Lie algebra makes sense over any field K. However, the definition 
of Lie groups is much more restrictive, because they are analytic rather than merely 
linear and hence require fields like C, R or the p-adic rationals $\mathbb{Q}_p$. A good question is:
which Lie-like group structures do Lie algebras correspond to, for the other fields? A 
good answer is: algebraic groups, which are to algebraic geometry what Lie groups are 
to differential geometry. See, for example, part III of [92] for an introduction. 

The main relationship between real Lie groups and algebras is summarised by:

Theorem 1.4.3 To any finite-dimensional real Lie algebra g, there is a unique connected
simply-connected Lie group $\tilde{G}$, called the universal cover group. If G is any other
connected Lie group with Lie algebra g, then there exists a discrete subgroup H of the
centre of $\tilde{G}$, such that $G \cong \tilde{G}/H$ and $H \cong \pi_1(G)$, the fundamental group of G.


The definitions of simply-connected and $\pi_1$ are given in Section 1.2.3. The universal cover
group $\tilde{G}$ of the Lie algebra R is the additive group R; the circle $G = S^1$ has the same Lie algebra
and can be written as $S^1 \cong \mathbb{R}/\mathbb{Z}$. The real Lie groups $SU_2(\mathbb{C})$ and $SO_3(\mathbb{R})$ both have Lie
algebra $\mathfrak{so}_3(\mathbb{R})$; $SU_2(\mathbb{C}) \cong S^3$ is the universal cover, and $SO_3(\mathbb{R}) \cong SU_2(\mathbb{C})/\{\pm I_2\}$ is the
3-sphere with antipodal points identified. The fundamental group $\pi_1(SL_2(\mathbb{R}))$ is $\mathbb{Z}$, and the
universal cover of $SL_2(\mathbb{R})$ (see Question 2.4.4) is an example of a Lie group that is not a matrix group.

So the classification of (connected) Lie groups reduces to the much simpler classification
of Lie algebras, together with the classification of discrete groups in the centre
of the corresponding $\tilde{G}$. The condition that G be connected is clearly necessary, as the
direct product of a Lie group with any discrete group leaves the Lie algebra unchanged.

Lie group structure theory is merely a major generalisation of linear algebra. The 
basic constructions familiar to undergraduates have important analogues valid in many 
Lie groups. For instance, in our youth we were taught to solve linear equations and invert 
matrices by reducing a matrix to row-echelon form using row operations. This says that 
any matrix $A \in GL_n(\mathbb{C})$ can be factorised A = BPN, where N is upper-triangular with
1’s on the diagonal, P is a permutation matrix and B is an upper-triangular matrix. This
is essentially the Bruhat decomposition of the Lie group $GL_n(\mathbb{C})$. More generally (where
it applies to any ‘reductive’ Lie group G), P will be an element of the so-called Weyl
group of G, and B will be in a ‘Borel subgroup’. For another example, everyone knows
that any nonzero real number x can be written uniquely as $x = (\pm 1) \cdot |x|$, and many
of us remember that any invertible matrix $A \in GL_n(\mathbb{R})$ can be uniquely written as a
product A = OP, where O is orthogonal and P is positive-definite. More generally, this
is called the Cartan decomposition for a real semi-simple Lie group. This encourages us 
to interpret a linear algebra theorem as a special case of a Lie group theorem . . . a squirrel. 


1.4.3 Simple Lie algebras 


The reader already weary of such algebraic tedium won’t be surprised to read that the
typical algebraic definitions can be imposed on Lie theory. The analogue of direct product
of groups here is direct sum $g_1 \oplus g_2$, with bracket $[(x_1, x_2), (y_1, y_2)] = ([x_1y_1]_1, [x_2y_2]_2)$.
Semi-direct sum is defined as usual. The analogue of normal subgroup here is called
an ideal: a subspace h of g such that $[gh] := \mathrm{span}\{[xy] \mid x \in g, y \in h\}$ is contained in
h. A Lie group N is a normal subgroup of Lie group G iff the Lie algebra of N is an
ideal of that of G. Given an ideal h of a Lie algebra g, the quotient space g/h has a
natural Lie algebra structure; if $\varphi : g_1 \to g_2$ is a Lie algebra homomorphism, then the
kernel $\ker(\varphi)$ is an ideal of $g_1$ and the image $\varphi(g_1)$ is a subalgebra of $g_2$ isomorphic
to $g_1/\ker(\varphi)$. The name ‘ideal’ comes from number theory (Section 1.7.1). The centre
$Z(g) := \{x \in g \mid [xg] = 0\}$ of g always forms an ideal, as does [gg].

A simple Lie algebra is one with no proper ideals. It is standard though to exclude
the one-dimensional Lie algebras, much as is often done with the cyclic groups $\mathbb{Z}_p$.
A semi-simple Lie algebra is defined as any g for which [gg] = g; it turns out that g
is semi-simple iff g is the (Lie algebra) direct sum $\oplus_i g_i$ of simple Lie algebras $g_i$. A
reductive Lie algebra g is defined by the relation $[gg] \oplus Z(g) = g$; g is reductive iff g is
the direct sum of a semi-simple Lie algebra with an abelian one. Of course simple Lie
algebras are more important, but semi-simple and reductive ones often behave similarly.

The finite-dimensional simple Lie algebras constitute an important class of Lie alge- 
bras. Although it is doubtful the reader has leapt out of his chair with surprise at this 
pronouncement, it is good to see explicit indications of this importance. 

Simple Lie algebras serve as building blocks for all other finite-dimensional Lie algebras,
in the following sense (called Levi decomposition; see, for example, chapter III.9
of [314] for a proof): any finite-dimensional Lie algebra g over C or R can be written
as a vector space in the form $g = r \oplus h$, where h is the largest semi-simple Lie
subalgebra of g, and r is called the radical of g and is by definition the maximal ‘solvable’
ideal of g. This means g is the semi-direct sum of r with $h \cong g/r$. A solvable Lie
algebra is the repeated semi-direct sum by one-dimensional Lie algebras; more concretely,
it is isomorphic to a subalgebra of the upper-triangular matrices in some $\mathfrak{gl}_n$.
Levi decomposition is the Lie theoretic analogue of the Jordan-Hölder Theorem of
Section 1.1.2.

It is reassuring that we can also see the importance of simple Lie algebras geometri- 
cally: given any finite-dimensional real Lie group that is ‘compact’ as a manifold (i.e. 
bounded and contains all its limit points), its Lie algebra is reductive. Conversely, any 
reductive real Lie algebra is the Lie algebra of a compact Lie group. 

In our struggle to understand a structure, it is healthy to find new ways to capture 
old information. Let us begin with a canonical way to associate linear endomorphisms 
(which the basis-hungry of us can regard as square matrices) to elements of the Lie 
algebra g. Define the ‘adjoint operator’ $\mathrm{ad}\,x : g \to g$ to be the linear map given by
(ad x)(y) = [xy]. In this language, anti-associativity of the bracket translates to the facts
that: (i) for each $x \in g$, ad x is a derivation of g; and (ii) the assignment $x \mapsto \mathrm{ad}\,x$
defines a ‘representation’ of g, called the adjoint representation (more on this next
section).

The point is that there are basis-independent ways to get numbers out of matrices. The 
Killing form $\kappa : g \times g \to \mathbb{C}$ of a (complex) Lie algebra g is defined by

$$\kappa(x|y) := \mathrm{tr}(\mathrm{ad}\,x \circ \mathrm{ad}\,y), \qquad \forall x, y \in g. \qquad (1.4.7a)$$

By ‘trace’ we mean to choose a basis, get matrices, and take the trace in the usual way;
the answer is independent of the basis chosen. The Killing form is symmetric, respects
the linear structure of g (i.e. is bilinear) and respects the bracket in the sense that

$$\kappa([xy]|z) = \kappa(x|[yz]), \qquad \forall x, y, z \in g. \qquad (1.4.7b)$$

This property of $\kappa$ is called invariance (Question 1.4.6(b)).

Table 1.3. Freudenthal’s Magic Square: the Lie algebra g(A_1, A_2)

A_1 \ A_2   R          C                    quat        oct
R           so_3(R)    su_3(R)              sp_3(R)     F_4
C           su_3(R)    su_3(R) ⊕ su_3(R)    su_6(R)     E_6
quat        sp_3(R)    su_6(R)              so_12(R)    E_7
oct         F_4        E_6                  E_7         E_8

Let A, B be two n × n real matrices; then

$$\mathrm{tr}(AB) = \sum_{i=1}^{n} A_{ii}B_{ii} + \sum_{1 \le i < j \le n} (A_{ij}B_{ji} + A_{ji}B_{ij}),$$

which can be interpreted as an indefinite inner-product on $\mathbb{R}^{n^2}$. Thus the Killing form
$\kappa(x|y)$ should be thought of as an inner-product on the vector space g. It arose historically
by expanding the characteristic polynomial $\det(\mathrm{ad}\,x - \lambda I)$ (Question 1.4.6(c)).

An inner-product on a complex space V has only one invariant: the dimension of the
subspace of null vectors. More precisely, define the radical of the Killing form to be

$$s(\kappa) = \{x \in g \mid \kappa(x|y) = 0 \ \ \forall y \in g\}.$$

By invariance of $\kappa$, $s(\kappa)$ is an ideal. It is always solvable.


Theorem 1.4.4 (Cartan’s criterion) Let g be a (complex or real) finite-dimensional
Lie algebra. Then g is semi-simple iff $\kappa$ is nondegenerate, i.e. $s(\kappa) = 0$.


Moreover, g is solvable iff $[gg] \subseteq s(\kappa)$. The nondegeneracy of the Killing form plays
a crucial role in the theory of semi-simple g. For instance, it is an easy orthogonality 
argument that a semi-simple Lie algebra is the direct sum of its simple ideals. 
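
The Killing form and Cartan’s criterion can be illustrated on two small examples. The sketch below is hedged: it encodes a Lie algebra by a table of structure constants (an assumed data layout), and compares the three-dimensional algebra with relations [ef] = h, [he] = 2e, [hf] = −2f against the two-dimensional solvable algebra with [xy] = y. All function names are illustrative.

```python
def make_bracket(dim, table):
    """Return c(i, j) = coordinates of [b_i, b_j], completed by anti-symmetry."""
    def c(i, j):
        if (i, j) in table:
            return table[(i, j)]
        if (j, i) in table:
            return tuple(-v for v in table[(j, i)])
        return (0,) * dim
    return c

def killing_gram(dim, table):
    """Gram matrix kappa(b_i | b_j) = tr(ad b_i . ad b_j)."""
    c = make_bracket(dim, table)
    # ad[i][k][l] = coefficient of b_k in [b_i, b_l]
    ad = [[[c(i, l)[k] for l in range(dim)] for k in range(dim)] for i in range(dim)]
    return [[sum(ad[i][k][l] * ad[j][l][k] for k in range(dim) for l in range(dim))
             for j in range(dim)] for i in range(dim)]

def is_invariant(dim, table):
    """Check kappa([b_i b_j] | b_k) == kappa(b_i | [b_j b_k]) on all basis triples."""
    c, gram = make_bracket(dim, table), killing_gram(dim, table)
    r = range(dim)
    return all(
        sum(c(i, j)[m] * gram[m][k] for m in r) == sum(c(j, k)[m] * gram[i][m] for m in r)
        for i in r for j in r for k in r)

def det(M):
    """Determinant by cofactor expansion (adequate for tiny matrices)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

# Basis (e, f, h) with [ef] = h, [he] = 2e, [hf] = -2f
SL2 = {(0, 1): (0, 0, 1), (2, 0): (2, 0, 0), (2, 1): (0, -2, 0)}
# Basis (x, y) with [xy] = y: a solvable, non-semi-simple algebra
SOLVABLE = {(0, 1): (0, 1)}

gram_sl2 = killing_gram(3, SL2)            # nondegenerate: determinant is nonzero
gram_solvable = killing_gram(2, SOLVABLE)  # degenerate: y lies in the radical
```

In the solvable example the radical of the form is spanned by y, which is also the derived algebra, in line with the solvability statement above.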
The classification of simple finite-dimensional Lie algebras over C was accomplished
at the turn of the twentieth century by Killing and Cartan. There are four infinite families
$A_r$ ($r \ge 1$), $B_r$ ($r \ge 3$), $C_r$ ($r \ge 2$) and $D_r$ ($r \ge 4$), and five exceptionals $E_6$, $E_7$, $E_8$, $F_4$
and $G_2$. $A_r$ can be thought of as $\mathfrak{sl}_{r+1}(\mathbb{C})$, the (r + 1) × (r + 1) matrices with trace 0.
The orthogonal algebras $B_r$ and $D_r$ can be identified with $\mathfrak{so}_{2r+1}(\mathbb{C})$ and $\mathfrak{so}_{2r}(\mathbb{C})$,
respectively, where $\mathfrak{so}_n(\mathbb{C})$ is all n × n anti-symmetric matrices $A^t = -A$. The symplectic
algebra $C_r$ is $\mathfrak{sp}_{2r}(\mathbb{C})$, i.e. all 2r × 2r matrices A obeying $A\Omega = -\Omega A^t$, where

$$\Omega = \begin{pmatrix} 0 & I_r \\ -I_r & 0 \end{pmatrix}$$

and $I_r$ is the identity. In all these cases the bracket is the commutator (1.4.4). The exceptional
algebras can be constructed using, for example, the octonions. For instance, $G_2$ is the
algebra of derivations of octonions. In fact, given any pair $A_1$, $A_2$ of normed division
rings (so the $A_i$ are R, C, the quaternions or the octonions), there is a general construction of
a simple Lie algebra $g(A_1, A_2)$ (over R); see, for example, section 4 of [29]. The results
are summarised in Freudenthal’s Magic Square (Table 1.3). The interesting thing here
is the uniform construction of four of the five exceptional Lie algebras. In Sections 1.5.2
and 1.6.2 we give further reasons for thinking of the exceptional Lie algebras as fitting
into a sequence, a nice paradigm whenever multiple exceptional structures are present.

To verify that (1.4.2b) truly is $\mathfrak{sl}_2(\mathbb{C})$, put

$$e = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad f = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad h = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (1.4.8)$$


The names A, B, C, D have no significance: since the four series start at r = 1, 2, 3, 4,
they were called A, B, C, D, respectively. Unfortunately, misfortune struck: at random
$B_2 = C_2$ was called orthogonal, although the affine Coxeter-Dynkin diagrams (Figure 3.2)
reveal that it is actually symplectic and only accidentally looks orthogonal. In
hindsight the names of the B- and C-series really should have been switched. 

For reasons we explain in Section 1.5.2, all semi-simple finite-dimensional Lie alge- 
bras over C have a presentation of the following form. 


Definition 1.4.5 (a) A Cartan$_{ss}$ matrix A is an n × n matrix with integer entries $a_{ij}$,
such that:


c1. each diagonal entry $a_{ii} = 2$;

c2. each off-diagonal entry $a_{ij}$, $i \ne j$, is a nonpositive integer;

c3. the zeros in A are symmetric about the main diagonal (i.e. $a_{ij} = 0$ iff $a_{ji} = 0$); and

c4. there exists a positive diagonal matrix D such that the product AD is positive-definite
(i.e. $(AD)^t = AD$ and $x^t ADx > 0$ for any real column vector $x \ne 0$).


(b) Given any Cartan$_{ss}$ matrix A, define a Lie algebra g(A) by the following presentation.
It has 3n generators $e_i, f_i, h_i$, for i = 1, ..., n, and obeys the relations

R1. $[e_if_j] = \delta_{ij}h_i$, $[h_ie_j] = a_{ij}e_j$, $[h_if_j] = -a_{ij}f_j$, and $[h_ih_j] = 0$, for all i, j; and

R2. $(\mathrm{ad}\,e_i)^{1-a_{ij}}e_j = (\mathrm{ad}\,f_i)^{1-a_{ij}}f_j = 0$ whenever $i \ne j$.


‘ss’ stands for ‘semi-simple’; it is standard to call these matrices A ‘Cartan matrices’, but
this can lead to terminology complications when in Section 3.3.2 we doubly generalise
Definition 1.4.5(a). As always, $\mathrm{ad}\,e : g \to g$ is defined by (ad e)f = [ef], so if $a_{ij} = 0$
then $[e_ie_j] = 0$, while if $a_{ij} = -1$ then $[e_i[e_ie_j]] = 0$. It is a theorem of Serre (1966) that
g(A) is finite-dimensional semi-simple, and any complex finite-dimensional semi-simple
Lie algebra g equals g(A) for some Cartan$_{ss}$ matrix A.

The terms ‘generators’ and ‘basis’ are sometimes confused. Both build up the whole 
algebra; the difference lies in which operations you are permitted to use. For a basis, you 
are only allowed to use linear combinations (i.e. addition of vectors and multiplication 
by numbers), while for generators you are also permitted multiplication of vectors (the 
bracket here). ‘Dimension’ refers to basis, while ‘rank’ usually refers to generators. For
instance, the (commutative associative) algebra of polynomials in one variable x is
infinite-dimensional, but the single polynomial x is enough to generate it (so its rank
is 1). Although g(A) has 3n generators, its dimension will usually be far greater.

The entries of Cartan$_{ss}$ matrices are mostly zeros, so it is more transparent to realise
them with a graph, called the Coxeter-Dynkin diagram.$^{13}$ The diagram corresponding
to matrix A has n nodes; the ith and jth nodes are connected with $a_{ij}a_{ji}$ edges, and if
$a_{ij} \ne a_{ji}$, we put an arrow over those edges pointing to i if $a_{ij} < a_{ji}$.

13 The more common name ‘Dynkin diagram’ is historically inaccurate. Coxeter was the first to introduce
these graphs, originally in the context of reflection groups, but in 1934 he applied them also to Lie
algebras. Dynkin’s involvement with them occurred over a decade later.

Fig. 1.16 The rank 2 Coxeter-Dynkin diagrams: $A_2$, $A_1 \oplus A_1$, $B_2$, $G_2$.

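For example, the 2 × 2 Cartan$_{ss}$ matrices are

$$\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \quad \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}, \quad \begin{pmatrix} 2 & -1 \\ -2 & 2 \end{pmatrix}, \quad \begin{pmatrix} 2 & -1 \\ -3 & 2 \end{pmatrix}.$$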


The third and fourth matrices can be replaced by their transposes, which correspond to 
isomorphic algebras. Their Coxeter-Dynkin diagrams are given in Figure 1.16. 
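
The rank 2 list can be recovered by brute force from conditions c1 to c4. The sketch below is illustrative and rests on one stated assumption: when both off-diagonal entries a, b are nonzero, the diagonal matrix D making AD symmetric is unique up to scale, so it suffices to test D = diag(−a, −b); when a = b = 0 the identity is used. The function and label names are hypothetical.

```python
def is_cartan_ss_2x2(a, b):
    """Test c1-c4 for the matrix [[2, a], [b, 2]] (c1 holds by construction)."""
    if a > 0 or b > 0:                       # c2: off-diagonal entries nonpositive
        return False
    if (a == 0) != (b == 0):                 # c3: zeros symmetric about the diagonal
        return False
    d1, d2 = (1, 1) if a == 0 else (-a, -b)  # candidate D with AD symmetric
    AD = [[2 * d1, a * d2], [b * d1, 2 * d2]]
    symmetric = AD[0][1] == AD[1][0]
    # Sylvester's criterion: both leading principal minors positive
    positive = AD[0][0] > 0 and AD[0][0] * AD[1][1] - AD[0][1] * AD[1][0] > 0
    return symmetric and positive            # c4

# The number of edges a*b identifies the diagram type in rank 2.
LABELS = {0: "A1+A1", 1: "A2", 2: "B2", 3: "G2"}

found = {(a, b): LABELS[a * b]
         for a in range(-5, 1) for b in range(-5, 1)
         if is_cartan_ss_2x2(a, b)}
```

The search returns six pairs (a, b): one each for $A_2$ and $A_1 \oplus A_1$, and two each for $B_2$ and $G_2$, reflecting the transposes mentioned above.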

To get a better feeling for relations R1, R2, consider a fixed i. The generators $e = e_i$,
$f = f_i$, $h = h_i$ obey (1.4.2b). In other words, every node in the Coxeter-Dynkin
diagram corresponds to a copy of the $A_1$ Lie algebra. The lines connecting these nodes
tell how these n copies of $A_1$ intertwine. For instance, the first Cartan matrix given above
corresponds to the Lie algebra $A_2 = \mathfrak{sl}_3(\mathbb{C})$. The two $A_1$ subalgebras that generate it
(one for each node) can be chosen to be the trace-zero matrices of the form

$$\begin{pmatrix} * & * & 0 \\ * & * & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 0 & 0 \\ 0 & * & * \\ 0 & * & * \end{pmatrix}.$$

The Lie algebra corresponding to a disjoint union $\cup_i D_i$ of diagrams is the direct
sum $\oplus_i g(D_i)$ of algebras. Thus we may require the matrix A to be indecomposable,
or equivalently that the Coxeter-Dynkin diagram be connected, in which case the Lie 
algebra g(D) will be simple. Of the four in Figure 1.16, only the second is decomposable. 


Theorem 1.4.6 (a) The complete list of indecomposable Cartan$_{ss}$ matrices, or equivalently
the connected Coxeter-Dynkin diagrams, is given in Figure 1.17. The series $A_r$,
$B_r$, $C_r$, $D_r$ are defined for $r \ge 1$, $r \ge 3$, $r \ge 2$, $r \ge 4$, respectively.

(b) The complete list of finite-dimensional simple Lie algebras over C consists of g(D) for each
of the Coxeter-Dynkin diagrams in Figure 1.17.


This classification changes if the field (the choice of scalars) is changed. As always,
C is better behaved than R because every polynomial can be factorised completely over
C (we say C is algebraically closed). This implies every matrix has an eigenvector
over C, something not true over R. Over C, each simple algebra has its own symbol
$X_r \in \{A_1, \ldots, G_2\}$; over R, each symbol corresponds to a number of inequivalent
algebras. See section VI.10 of [348] or chapter 8 of [214] for details. For example, ‘$A_1$’
corresponds to three different real simple Lie algebras, namely the matrix algebras $\mathfrak{sl}_2(\mathbb{R})$,
$\mathfrak{sl}_2(\mathbb{C})$ (interpreted as a real vector space) and $\mathfrak{su}_2(\mathbb{C}) \cong \mathfrak{so}_3(\mathbb{R})$. The simple Lie algebra
classification is known in any characteristic p > 7 (see e.g. [559]). Smaller primes
usually behave poorly, and the classification for characteristic 2 is probably hopeless.

Fig. 1.17 The Coxeter-Dynkin diagrams of the simple Lie algebras.


Simple Lie algebras need not be finite-dimensional. An example is the Witt algebra
Witt, defined (over C) by the basis$^{14}$ $\ell_n$, $n \in \mathbb{Z}$, and relations

$$[\ell_m\ell_n] = (m - n)\,\ell_{m+n}. \qquad (1.4.9)$$

Using the realisation $\ell_n = -ie^{-in\theta}\frac{d}{d\theta}$, Witt is seen to be the polynomial subalgebra of the
complexification $\mathbb{C} \otimes \mathrm{Vect}(S^1)$ (i.e. the scalar field of $\mathrm{Vect}(S^1)$ is changed from R to C).
Incidentally, infinite-dimensional Lie algebras need not have a Lie group: for example,
the real algebra $\mathrm{Vect}(S^1)$ has the Lie group $\mathrm{Diff}(S^1)$ of diffeomorphisms $S^1 \to S^1$, but
its complexification $\mathbb{C} \otimes \mathrm{Vect}(S^1)$ has no Lie group (Section 3.1.2). The Witt algebra is
fundamental to Moonshine. We study it in Section 3.1.2.
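
That the relations (1.4.9) define a Lie algebra can be spot-checked on basis elements. In this sketch an element of Witt is stored as a dictionary from the index n to its coefficient (an assumed encoding, with only finitely many nonzero entries), and anti-associativity is tested in its usual cyclic form. The helper names are illustrative.

```python
def bracket(x, y):
    """Bilinear extension of [l_m, l_n] = (m - n) l_{m+n}."""
    out = {}
    for m, a in x.items():
        for n, b in y.items():
            coeff = (m - n) * a * b
            if coeff:
                out[m + n] = out.get(m + n, 0) + coeff
    return {n: a for n, a in out.items() if a != 0}

def add(*elements):
    total = {}
    for x in elements:
        for n, a in x.items():
            total[n] = total.get(n, 0) + a
    return {n: a for n, a in total.items() if a != 0}

def ell(n):
    """Basis vector l_n."""
    return {n: 1}

def jacobi(x, y, z):
    """[x[yz]] + [y[zx]] + [z[xy]], which should vanish."""
    return add(bracket(x, bracket(y, z)),
               bracket(y, bracket(z, x)),
               bracket(z, bracket(x, y)))

indices = range(-3, 4)
jacobi_holds = all(jacobi(ell(m), ell(n), ell(p)) == {}
                   for m in indices for n in indices for p in indices)
antisymmetric = all(add(bracket(ell(m), ell(n)), bracket(ell(n), ell(m))) == {}
                    for m in indices for n in indices)
```

A check on a finite window of indices is of course not a proof, but the general identity reduces to the same polynomial cancellation in m, n, p.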


Question 1.4.1. Let G be a finite group, and CG be its group algebra (i.e. all formal 
linear combinations $\sum_g a_g g$ over C). Verify that CG becomes a Lie algebra when given
the bracket [g, h] = gh — hg (extend linearly to all of CG). Identify this Lie algebra. 


Question 1.4.2. Let K be any field. Find all two-dimensional Lie algebras over K, up to 
(Lie algebra) isomorphism. 


Question 1.4.3. Prove the Witt algebra (1.4.9) is simple. 


Question 1.4.4. Prove the Lie algebraic analogue of the statement that any homomor- 
phism f : G — H between simple groups is either constant or a group isomorphism. 


Question 1.4.5. The nonzero quaternions a1 + bi + cj + dk, for $a, b, c, d \in \mathbb{R}$, form a
Lie group by multiplication (recall that $i^2 = j^2 = k^2 = -1$, ij = −ji = k, jk = −kj = i
and ki = −ik = j). Find the Lie algebra.


Question 1.4.6. (a) Verify that $\mathrm{ad}\,[xy] = \mathrm{ad}\,x \circ \mathrm{ad}\,y - \mathrm{ad}\,y \circ \mathrm{ad}\,x$, for any elements x, y
in a Lie algebra g.
(b) Verify that the Killing form is invariant (i.e. obeys (1.4.7b)) for any Lie algebra.
(c) Let g be n-dimensional and semi-simple. Choose any $x \in g$. Verify that the coefficient
of $\lambda^{n-2}$ in the characteristic polynomial $\det(\mathrm{ad}\,x - \lambda I)$ is proportional to $\kappa(x|x)$.


14 In order to avoid convergence complications, only finite linear combinations of basis vectors are typically
permitted in algebra. Infinite linear combinations would require taking some completion.


Question 1.4.7. Consider the complex Lie algebra g(A), for $A = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$, defined
in Definition 1.4.5(b).
(a) Prove that a basis for g is $\{e_i, f_i, h_i, [e_1e_2], [f_1f_2]\}$ and thus that g is eight-dimensional.
Prove from first principles that g is simple.
(b) Verify that the following generates a Lie algebra isomorphism of g with $\mathfrak{sl}_3(\mathbb{C})$:

$$e_1 \mapsto \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad f_1 \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad h_1 \mapsto \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$

$$e_2 \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad f_2 \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \quad h_2 \mapsto \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$$

Question 1.4.8. Show that property c3 can be safely dropped. That is, given a Z-matrix
A obeying c1, c2 and c4, show that there is a Cartan$_{ss}$ matrix A′ such that the Lie algebra
g(A) (defined as in Definition 1.4.5(b)) is isomorphic to g(A′).


Question 1.4.9. Are Vect(R) and $\mathrm{Vect}(S^1)$ isomorphic as Lie algebras?


1.5 Representations of simple Lie algebras 


The representation theory of the simple Lie algebras can be regarded as an enormous
generalisation of trigonometry. For instance, the facts that $\frac{\sin(nx)}{\sin(x)}$ can be written as a
polynomial in cos(x) for any $n \in \mathbb{Z}$, and that

$$\frac{\sin(mx)\sin(nx)}{\sin(x)} = \sin((m+n-1)x) + \sin((m+n-3)x) + \cdots + \sin((m-n+1)x)$$

for any $m, n \in \mathbb{N}$ are both easy special cases of the theory. Representation theory is
vital to the classification and structure of simple Lie algebras, and leads to the beautiful
geometry and combinatorics of root systems. The relevance of Lie algebras to Moonshine
and conformal field theory (which is considerable) is through their representations.
The book [219] is a standard treatment of Lie representation theory; it is presented with 
more of a conformal field theoretic flavour in [214]. 


1.5.1 Definitions and examples 


Although we have learned over the past couple of centuries that commutativity can 
be dropped without losing depth and usefulness, most interesting algebraic structures 
obey some form of associativity. In fact, true associativity (as opposed to, for example, 
anti-associativity) really simplifies the arithmetic. Given the happy accident that the
commutator [x, y] := xy − yx in any associative algebra obeys anti-associativity, it is
tempting to seek ways in which associative algebras $\mathfrak{A}$ can ‘model’ or represent a given
Lie algebra. That is, we would like a map $\rho : g \to \mathfrak{A}$ that preserves the linear structure
(i.e. $\rho$ is linear) and sends the bracket [xy] in g to the commutator $[\rho(x), \rho(y)]$ in $\mathfrak{A}$.

In practice groups often appear as symmetries, and algebras as their infinitesimal
generators. These symmetries often act linearly. In other words, the preferred associative
algebras are usually matrix algebras, and so we are interested in Lie algebra
homomorphisms $\rho : g \to \mathfrak{gl}_n$. The dimension of this representation is the number n.

Completely equivalent to a representation is the notion of ‘g-module M’, as is the case
for finite groups (Section 1.1.3). A g-module is a vector space M on which g acts (on
the left) by product x.v, for $x \in g$, $v \in M$. This product must be bilinear, and must obey
[xy].v = x.(y.v) − y.(x.v). We use ‘module’ and ‘representation’ interchangeably.

Lie algebra modules behave much like finite group modules. Let $\rho_i : g \to \mathfrak{gl}(V_i)$ be
two representations of g. We define their direct sum $\rho_1 \oplus \rho_2 : g \to \mathfrak{gl}(V_1 \oplus V_2)$ as usual
by

$$(\rho_1 \oplus \rho_2)(x)(v_1, v_2) = (\rho_1(x)(v_1), \rho_2(x)(v_2)), \qquad \forall x \in g, \ v_i \in V_i. \qquad (1.5.1a)$$

Lie algebras are special in that (like groups) we can multiply their representations: define
the tensor product representation $\rho_1 \otimes \rho_2 : g \to \mathfrak{gl}(V_1 \otimes V_2)$ through

$$(\rho_1 \otimes \rho_2)(x)(v_1 \otimes v_2) = (\rho_1(x)v_1) \otimes v_2 + v_1 \otimes (\rho_2(x)v_2), \qquad \forall x \in g, \ v_i \in V_i. \qquad (1.5.1b)$$

Recall that the vector space $V_1 \otimes V_2$ is defined to be the span of all $v_1 \otimes v_2$, so the
value $(\rho_1 \otimes \rho_2)(x)(v)$ on generic vectors $v \in V_1 \otimes V_2$ requires (1.5.1b) to be extended
linearly. It is easy to verify that (1.5.1b) defines a representation of g; the obvious but
incorrect attempt $(\rho_1(x)v_1) \otimes (\rho_2(x)v_2)$ would lose linear dependence on x. As usual,
the dimension of $\rho_1 \oplus \rho_2$ is $\dim(\rho_1) + \dim(\rho_2)$, while $\dim(\rho_1 \otimes \rho_2)$ is $\dim(\rho_1)\dim(\rho_2)$.

A rich representation theory requires in addition a notion of dual or contragredient.
Recall that the dual space V* is the space of all linear functionals $v^* : V \to \mathbb{C}$. Given a
g-module V, the natural module structure on V* is the contragredient, defined by

$$(x.v^*)(u) = -v^*(x.u), \qquad \forall x \in g, \ v^* \in V^*, \ u \in V. \qquad (1.5.1c)$$

This defines $\rho^*(x)v^* \in V^*$ by its value at each $u \in V$. In terms of matrices, (1.5.1c)
amounts to choosing $\rho^*(x)$ to be $-\rho(x)^t$, the negative of the transpose of $\rho(x)$. The
negative sign is needed for the Lie brackets to be preserved.
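
The tensor product and contragredient constructions can be tried out on 2 × 2 matrices. The sketch below is illustrative: it assumes the standard matrices e, f, h with [ef] = h, [he] = 2e, [hf] = −2f as the starting representation, realises the tensor product by Kronecker products, and confirms that the ‘obvious but incorrect’ product fails. All helper names are hypothetical.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def comm(A, B):
    return add(matmul(A, B), scale(-1, matmul(B, A)))

def kron(A, B):
    """Kronecker product: the matrix of A (x) B on the tensor product space."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def is_sl2_triple(e, f, h):
    """Check [ef] = h, [he] = 2e, [hf] = -2f."""
    return (comm(e, f) == h
            and comm(h, e) == scale(2, e)
            and comm(h, f) == scale(-2, f))

E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]

def tensor_square(X):
    """rho(x) (x) 1 + 1 (x) rho(x), the matrix form of (1.5.1b)."""
    return add(kron(X, I2), kron(I2, X))

def dual(X):
    """Contragredient: minus the transpose."""
    return [[-x for x in col] for col in zip(*X)]

tensor_ok = is_sl2_triple(tensor_square(E), tensor_square(F), tensor_square(H))
dual_ok = is_sl2_triple(dual(E), dual(F), dual(H))
naive_ok = is_sl2_triple(kron(E, E), kron(F, F), kron(H, H))  # expected to fail
```

The tensor square acts on a space of dimension 2 · 2 = 4, matching the dimension count in the text.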

The definition of unitary representation $\rho$ for finite groups says each $\rho(g)$ should
be a unitary matrix. Since the exponential of a Lie algebra representation should be a
Lie group representation, we would like to say that a unitary representation $\rho$ of a Lie
algebra should obey $\rho(x)^\dagger = -\rho(x)$ for any $x \in g$, where ‘†’ is the adjoint (complex
conjugate-transpose), that is to say all matrices $\rho(x)$ should be anti-self-adjoint. This
works for real Lie algebras, but not for complex ones: if $\rho(x)$ is anti-self-adjoint, then
$\rho(ix) = i\rho(x)$ will be self-adjoint!

The correct notion of unitary representation $\rho : g \to \mathfrak{gl}(V)$ for complex Lie algebras
is that there is an anti-linear map $\omega : g \to g$ obeying $\omega[xy] = -[\omega x, \omega y]$, such
that $\rho(x)^\dagger = \rho(\omega x)$. ‘Anti-linear’ means $\omega(ax + y) = \bar{a}\,\omega(x) + \omega(y)$. Equivalently, $\rho$ is
unitary if the complex vector space V has a Hermitian form $(u, v) \in \mathbb{C}$ on it, such that

$$(u, \rho(x)v) = (\rho(\omega x)u, v). \qquad (1.5.2)$$

For the case of real Lie algebras, $\omega x = -x$ works. For the complex semi-simple Lie
algebra g(A) of Definition 1.4.5, the most common choice is $\omega e_i = f_i$, $\omega f_i = e_i$, $\omega h_i = h_i$
(this is the negative of the so-called Chevalley involution).

A submodule of a g-module V is a subspace $U \subseteq V$ obeying $g.U \subseteq U$. The obvious
submodules are {0} and V; an irreducible module is one whose only submodules are
those trivial ones. Schur’s Lemma (Lemma 1.1.3) holds verbatim, provided G is replaced
with a finite-dimensional Lie algebra g, and $\rho$, $\rho'$ are also finite-dimensional.

Finding all possible modules, even for the simple Lie algebras, is probably hopeless. 
For example, all simple Lie algebras have uncountably many irreducible ones. However, 
it is possible to find all of their finite-dimensional modules. 


Theorem 1.5.1 Let g be a complex finite-dimensional semi-simple Lie algebra of rank
r. Then any finite-dimensional g-module is completely reducible into a direct sum of
irreducible modules. Moreover, there is a unique unitary irreducible module $L(\lambda)$ for
each r-tuple $\lambda = (\lambda_1, \ldots, \lambda_r)$ of nonnegative integers, and all irreducible ones are of
that form.


Let $P_+ = P_+(g)$ denote the set of all r-tuples $\lambda$ of nonnegative integers; $\lambda \in P_+$ are
called dominant integral weights. The module $L(\lambda)$ is called the irreducible module
with highest weight $\lambda$. We explain how to prove Theorem 1.5.1 and construct $L(\lambda)$ in
Section 1.5.3, but to get an idea of what $L(\lambda)$ looks like, consider $A_1$ from (1.4.2b). For
any $\lambda \in \mathbb{C}$, define $x_0 \ne 0$ to formally obey $h.x_0 = \lambda x_0$ and $e.x_0 = 0$. Define inductively
$x_{i+1} := f.x_i$ for i = 0, 1, ... The span of all $x_i$, call it $M(\lambda)$, is an infinite-dimensional
$A_1$-module: the calculations $h.x_{i+1} = h.(f.x_i) = ([hf] + fh).x_i = (-2f + fh).x_i$ and
$e.x_{i+1} = e.(f.x_i) = ([ef] + fe).x_i = (h + fe).x_i$ show inductively that
$h.x_m = (\lambda - 2m)x_m$ and $e.x_m = (\lambda - m + 1)\,m\,x_{m-1}$. The linear independence of the $x_i$ follows from
these. $M(\lambda)$ is called a Verma module with highest weight $\lambda$, and $x_0$ its highest-weight
vector.
Is $M(\lambda)$ unitary? Here, $\omega$ interchanges e and f, and fixes h. The calculation

$$(x_i, x_i) = (f.x_{i-1}, x_i) = (x_{i-1}, e.x_i) = (\lambda - i + 1)\,i\,(x_{i-1}, x_{i-1}) \qquad (1.5.3)$$

tells us that the norm-squares $(x_i, x_i)$ and $(x_{i-1}, x_{i-1})$ can’t both be positive, if i is
sufficiently large. Thus no Verma module $M(\lambda)$ is unitary.

Now specialise to $\lambda = n \in \mathbb{N} := \{0, 1, 2, \ldots\}$. Since $e.x_{n+1} = 0$ and $h.x_{n+1} = (-n - 2)x_{n+1}$,
M(n) contains a submodule with highest-weight vector $x_{n+1}$, isomorphic to
M(−n − 2). $x_{n+1}$ is called a singular or null vector, because by (1.5.3) it has
norm-squared $(x_{n+1}, x_{n+1}) = 0$. In other words, we could set $x_{n+1} := 0$ and still have
an $A_1$-module: a finite-dimensional module L(n) := M(n)/M(−n − 2) with basis
$\{x_0, x_1, \ldots, x_n\}$ and dimension n + 1. This basis is orthogonal and L(n) is unitary.
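
The finite-dimensional module L(n) can be written down explicitly from the formulas for the action on the $x_m$. The following sketch builds the matrices of e, f, h in the basis $x_0, \ldots, x_n$ (column m holding the image of $x_m$, an assumed convention) and checks the defining relations [ef] = h, [he] = 2e, [hf] = −2f. The function names are illustrative.

```python
def highest_weight_matrices(n):
    """Matrices of e, f, h on L(n) in the basis x_0, ..., x_n."""
    size = n + 1
    e = [[0] * size for _ in range(size)]
    f = [[0] * size for _ in range(size)]
    h = [[0] * size for _ in range(size)]
    for m in range(size):
        h[m][m] = n - 2 * m                 # h.x_m = (n - 2m) x_m
        if m + 1 < size:
            f[m + 1][m] = 1                 # f.x_m = x_{m+1}, with x_{n+1} set to 0
        if m >= 1:
            e[m - 1][m] = (n - m + 1) * m   # e.x_m = (n - m + 1) m x_{m-1}
    return e, f, h

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def comm(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(AB, BA)]

def satisfies_relations(n):
    e, f, h = highest_weight_matrices(n)
    return (comm(e, f) == h
            and comm(h, e) == [[2 * x for x in row] for row in e]
            and comm(h, f) == [[-2 * x for x in row] for row in f])

results = {n: satisfies_relations(n) for n in range(6)}
e1, f1, h1 = highest_weight_matrices(1)   # the two-dimensional module L(1)
```

For n = 1 the three matrices produced are the familiar 2 × 2 matrices with a single 1 above the diagonal, a single 1 below it, and diagonal entries 1, −1.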

For example, the basis $\{x_0, x_1\}$ of L(1) recovers the representation $\mathfrak{sl}_2(\mathbb{C})$ of (1.4.8).
The adjoint representation of Section 1.5.2 is L(2).

The situation for the other simple Lie algebras $X_r$ is similar (Section 1.5.3). On the
other hand, non-semi-simple Lie algebras have a much more complicated representation
theory. They have finite-dimensional modules that aren’t completely reducible. For
example, given any finite-dimensional representation $\rho : g \to \mathfrak{gl}(V)$ of any solvable Lie
algebra g, a basis can be found for V such that every matrix $\rho(x)$ will be upper-triangular
(i.e. the entries $\rho(x)_{ij}$ will equal 0 when i > j); see Lie’s Theorem in section 4.1 of
[300]. This implies that any finite-dimensional irreducible module of a solvable g is
one-dimensional, and thus a finite-dimensional representation $\rho$ will be completely reducible
iff all matrices $\rho(x)$ are simultaneously diagonalisable. See Question 1.5.2.


1.5.2 The structure of simple Lie algebras 


Representation theory is important in the structure theory of the Lie algebra itself, and 
as such is central to the classification of simple Lie algebras. In particular, any Lie 
algebra g is itself a g-module with action x.y := (ad x)(y) = [xy] — the so-called adjoint 
representation. In this subsection we use this representation to associate a Cartan matrix 
to each semi-simple g. 

Consider for concreteness $g = \mathfrak{sl}_n(\mathbb{C})$, the Lie algebra of all trace-0 n × n matrices,
for $n \ge 2$. Let h be the set of all diagonal trace-0 matrices. Then the matrices in h
commute with each other, so h is an abelian Lie subalgebra of g. Restricting the adjoint
representation of g, we can regard g as an $(n^2 - 1)$-dimensional h-module. Unlike most
h-modules, this one is completely reducible.

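In particular, let $E_{(ab)}$ be the $n \times n$ matrix with entries $(E_{(ab)})_{ij} = \delta_{ai}\delta_{bj}$, that is with 0's everywhere except for a '1' in the $ab$ entry. Since $E_{(ab)}E_{(cd)} = \delta_{bc}E_{(ad)}$, we get

$$[E_{(ab)}, E_{(cd)}] = \delta_{bc}E_{(ad)} - \delta_{ad}E_{(cb)}. \tag{1.5.4a}$$

Now, a basis for $\mathfrak{h}$ is $A_a = E_{(a,a)} - E_{(a+1,a+1)}$ for $a = 1, \ldots, n-1$. Thus

$$[A_a, E_{(cd)}] = (\delta_{ac} - \delta_{a+1,c} - \delta_{ad} + \delta_{a+1,d})\,E_{(cd)} \tag{1.5.4b}$$

and the basis $\{E_{(cd)}\}_{1 \le c \ne d \le n} \cup \{A_a\}_{1 \le a < n}$ of $\mathfrak{g}$ simultaneously diagonalises all endomorphisms $\mathrm{ad}\,A_a$. In other words, this representation $\mathrm{ad}\,\mathfrak{h}$ decomposes into a direct sum of one-dimensional $\mathfrak{h}$-modules. Define functionals $\alpha_{(cd)} \in \mathfrak{h}^*$ by

$$\alpha_{(cd)}(A_a) = \delta_{ac} - \delta_{a+1,c} - \delta_{ad} + \delta_{a+1,d}.$$

Then we can write

$$\mathfrak{g} = \bigoplus_{1 \le c \ne d \le n} \mathbb{C}E_{(cd)} \oplus \mathrm{span}\{A_a\}_{1 \le a < n} = \bigoplus_{\alpha \in \Phi} \mathfrak{g}_\alpha \oplus \mathfrak{h}, \tag{1.5.4c}$$

where $\Phi = \{\alpha_{(cd)}\}_{1 \le c \ne d \le n}$ and $\mathfrak{g}_{\alpha_{(cd)}} = \mathbb{C}E_{(cd)}$. The functional $\alpha = \alpha_{(cd)}$ is called a root because $\alpha(A)$ is the eigenvalue of the operator $\mathrm{ad}\,A$ on the eigenspace $\mathbb{C}E_{(cd)}$ and thus is a root of the characteristic polynomial of $\mathrm{ad}\,A$. We avoid calling 0 (the functional for $\mathfrak{h}$) a root because it behaves differently: for example, $\mathfrak{g}_0 = \mathfrak{h}$ has dimension $n - 1$ but all other $\mathfrak{g}_\alpha$ have dimension 1. In Section 3.3.1, though, we identify 0 as a precursor to the so-called imaginary roots of Kac–Moody algebras.

From the identity

$$(\mathrm{ad}\,A)[xy] = [(\mathrm{ad}\,A)x, y] + [x, (\mathrm{ad}\,A)y]$$

(which holds in any Lie algebra), or more concretely from (1.5.4a), we see that the decomposition (1.5.4c) defines a grading $[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] \subseteq \mathfrak{g}_{\alpha+\beta}$, for any roots $\alpha, \beta \in \Phi$, where we put $\mathfrak{g}_{\alpha+\beta} = \{0\}$ if $\alpha + \beta \notin \Phi \cup \{0\}$. In fact, a little more care verifies that equality holds whenever $\alpha + \beta \ne 0$ (for $\beta = -\alpha$ the bracket $[\mathfrak{g}_\alpha, \mathfrak{g}_{-\alpha}]$ is only a one-dimensional subspace of $\mathfrak{h}$):

$$[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] = \mathfrak{g}_{\alpha+\beta}, \qquad \forall \alpha, \beta \in \Phi,\ \alpha + \beta \ne 0. \tag{1.5.4d}$$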
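The eigenvalue relation (1.5.4b) is easy to confirm by machine for small $n$. The following is a minimal sketch in plain Python: it builds the elementary matrices $E_{(ab)}$ and the diagonal matrices $A_a$, and checks that each $E_{(cd)}$ is an eigenvector of $\mathrm{ad}\,A_a$ with eigenvalue $\delta_{ac} - \delta_{a+1,c} - \delta_{ad} + \delta_{a+1,d}$. The function names (`unit`, `bracket`, `root_value`, `check_root_spaces`) are our own illustrative choices, not notation from the text.

```python
from itertools import product


def unit(n, a, b):
    """The n x n matrix E_(ab): a 1 in row a, column b (1-indexed), 0 elsewhere."""
    return [[1 if (i, j) == (a - 1, b - 1) else 0 for j in range(n)] for i in range(n)]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def sub(x, y):
    return [[p - q for p, q in zip(r, s)] for r, s in zip(x, y)]


def bracket(x, y):
    """The commutator [x, y] = xy - yx."""
    return sub(matmul(x, y), matmul(y, x))


def delta(i, j):
    return 1 if i == j else 0


def root_value(a, c, d):
    """The value alpha_(cd)(A_a) claimed in (1.5.4b)."""
    return delta(a, c) - delta(a + 1, c) - delta(a, d) + delta(a + 1, d)


def check_root_spaces(n):
    """True if [A_a, E_(cd)] = alpha_(cd)(A_a) E_(cd) for all a and all c != d."""
    for a in range(1, n):
        A = sub(unit(n, a, a), unit(n, a + 1, a + 1))
        for c, d in product(range(1, n + 1), repeat=2):
            if c == d:
                continue
            E = unit(n, c, d)
            expected = [[root_value(a, c, d) * entry for entry in row] for row in E]
            if bracket(A, E) != expected:
                return False
    return True
```

For instance, `check_root_spaces(4)` runs through all 3 x 12 pairs $(A_a, E_{(cd)})$ of $\mathfrak{sl}_4(\mathbb{C})$.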


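In Question 1.5.3 you compute the Killing form (1.4.7a). We find that $\kappa(E_{(ab)}|E_{(cd)}) = 0$ unless $(d, c) = (a, b)$, and that $\kappa$ is positive-definite when restricted to the real $(n-1)$-dimensional space $\mathfrak{h}_{\mathbb{R}}$ spanned over $\mathbb{R}$ by $A_1, \ldots, A_{n-1}$.

The roots $\alpha_1 = \alpha_{(1,2)}, \ldots, \alpha_{n-1} = \alpha_{(n-1,n)}$ form a basis $\Delta$ for the dual space $\mathfrak{h}^*$, and are called simple roots. Explicitly, the root $\alpha_{(cd)} \in \Phi$ is

$$\alpha_{(cd)} = \begin{cases} \alpha_c + \alpha_{c+1} + \cdots + \alpha_{d-1} & \text{if } c < d \\ -\alpha_d - \alpha_{d+1} - \cdots - \alpha_{c-1} & \text{if } c > d \end{cases}.$$

Note that for each root $\alpha = \alpha_{(ab)}$, the elements $e_\alpha := E_{(ab)}$, $f_\alpha := E_{(ba)}$, $h_\alpha := [e_\alpha f_\alpha] = E_{(aa)} - E_{(bb)}$ span a copy of $\mathfrak{sl}_2$. In particular, the $\mathfrak{sl}_2$'s coming from the simple roots $\alpha_i$ generate all of $\mathfrak{sl}_n(\mathbb{C})$, thanks to the grading (1.5.4d). For each $\alpha_i, \alpha_j \in \Delta$, let

$$a_{ij} = \alpha_j(h_{\alpha_i}) = \begin{cases} 2 & \text{if } i = j \\ -1 & \text{if } |i - j| = 1 \\ 0 & \text{otherwise} \end{cases}.$$

This defines a Cartan matrix $A$. To verify that $\mathfrak{g}(A)$ is $\mathfrak{sl}_n(\mathbb{C})$, do calculations such as

$$[e_{\alpha_i}[e_{\alpha_i}e_{\alpha_{i+1}}]] \in \mathfrak{g}_{2\alpha_i + \alpha_{i+1}} = \{0\}.$$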
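The Cartan matrix of $\mathfrak{sl}_n(\mathbb{C})$ can be read off mechanically from the simple roots. In the sketch below we assume, as in the discussion above, that the simple root $\alpha_j$ is the functional attached to the matrix unit in position $(j, j+1)$, and that $h_{\alpha_i}$ is the diagonal matrix with $+1$ in position $i$ and $-1$ in position $i+1$. The function names are illustrative only.

```python
def delta(i, j):
    return 1 if i == j else 0


def simple_root_on_h(j, i):
    """alpha_j(h_i), where alpha_j belongs to the matrix unit in position (j, j+1)
    and h_i = E_(i,i) - E_(i+1,i+1)."""
    c, d = j, j + 1
    return delta(i, c) - delta(i + 1, c) - delta(i, d) + delta(i + 1, d)


def cartan_matrix_sl(n):
    """The (n-1) x (n-1) matrix with entries a_ij = alpha_j(h_i) for sl_n."""
    return [[simple_root_on_h(j, i) for j in range(1, n)] for i in range(1, n)]
```

Calling `cartan_matrix_sl(4)` returns the familiar tridiagonal matrix with 2 on the diagonal and $-1$ beside it.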


This analysis continues to hold for any semi-simple $\mathfrak{g}$. The space $\mathfrak{h}$ of diagonal matrices is replaced by any subalgebra of $\mathfrak{g}$, all of whose elements $x$ have diagonalisable operator $\mathrm{ad}\,x$. Any maximal such Lie subalgebra is called a Cartan subalgebra. Since almost every polynomial has distinct roots, almost every matrix is diagonalisable; for semi-simple $\mathfrak{g}$, almost every $\mathrm{ad}\,x$ is diagonalisable. A Cartan subalgebra is necessarily abelian.

Given a Cartan subalgebra $\mathfrak{h}$, we get a root-space decomposition

$$\mathfrak{g} = \bigoplus_{\alpha \in \Phi} \mathfrak{g}_\alpha \oplus \mathfrak{h} \tag{1.5.5a}$$

as in (1.5.4c), by simultaneously diagonalising all $\mathrm{ad}\,h$. The $\alpha \in \Phi \subset \mathfrak{h}^*$ are called roots as before; the root spaces $\mathfrak{g}_\alpha$ are defined to be the simultaneous eigenspaces

$$\mathfrak{g}_\alpha = \{x \in \mathfrak{g} \mid [hx] = \alpha(h)x \ \ \forall h \in \mathfrak{h}\}. \tag{1.5.5b}$$

The $\mathfrak{g}_\alpha$ are always one-dimensional and define a grading as in (1.5.4d). The Killing form $\kappa$ is a nondegenerate inner-product, with $\kappa(\mathfrak{g}_\alpha|\mathfrak{g}_\beta) = 0$ unless $\beta = -\alpha$. The finite set $\Phi$ of roots is called the root system; the full algebra $\mathfrak{g}$ can be reconstructed directly from $\Phi$.

Each $\mathfrak{g}$ has uncountably many possible Cartan subalgebras. They are related by automorphisms of $\mathfrak{g}$, in fact 'inner automorphisms' $\exp(\mathrm{ad}\,x)$ (Section 1.5.4), so they yield equivalent root systems $\Phi$. Let $N(\mathfrak{h})$ denote the set of all inner automorphisms that map the space $\mathfrak{h}$ onto itself, and let $C(\mathfrak{h}) = \exp(\mathrm{ad}\,\mathfrak{h})$ denote the set of all inner automorphisms that fix $\mathfrak{h}$ pointwise. Then $C(\mathfrak{h})$ is a normal subgroup of $N(\mathfrak{h})$, and the quotient $N(\mathfrak{h})/C(\mathfrak{h})$ of these continuous groups is a finite group called the Weyl group $W$. It is a symmetry of the data of $\mathfrak{g}$, as we will see.

The Killing form identifies $\mathfrak{h}$ and its dual (this is the raising/lowering of indices familiar to any physicist, or transpose familiar to everyone else). We thus get an inner-product on the dual space $\mathfrak{h}^*$, positive-definite on the real span of the roots. For increased readability, we write $(\beta|\beta')$ in place of $\kappa(\beta|\beta')$, for $\beta, \beta' \in \mathfrak{h}^*$. The Weyl group $W$ acts on $\mathfrak{h}^*$; in particular it is generated by the reflections

$$r_\alpha(\beta) = \beta - 2\,\frac{(\beta|\alpha)}{(\alpha|\alpha)}\,\alpha \tag{1.5.5c}$$

through each root $\alpha \in \Phi$ (recall Question 1.2.5). The Weyl group $W$ permutes the roots and preserves the Killing form. Each reflection $r_\alpha$ fixes the hyperplane orthogonal to $\alpha$. Removing those hyperplanes decomposes $\mathfrak{h}^*$ into connected components, one for every element of $W$. Choose one at random and call it the positive chamber $C$.
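To see the reflections at work, here is a small sketch that generates the Weyl group from the reflections $r_\alpha(\beta) = \beta - 2\frac{(\beta|\alpha)}{(\alpha|\alpha)}\alpha$ through all roots, recording each group element by where it sends the roots. It assumes the roots are supplied as vectors in a Euclidean space with the usual dot product standing in for the form $(\cdot|\cdot)$; the coordinates used in the test are the standard realisations of the $A_2$ and $B_2$ root systems, supplied by us for illustration.

```python
from fractions import Fraction


def dot(u, v):
    return sum(Fraction(a) * b for a, b in zip(u, v))


def reflect(alpha, beta):
    """The reflection r_alpha applied to beta."""
    c = 2 * dot(beta, alpha) / dot(alpha, alpha)
    return tuple(b - c * a for a, b in zip(alpha, beta))


def weyl_group_order(roots):
    """Order of the group generated by the reflections through all the given roots.

    Each element w is stored as the tuple (w(beta) for beta in roots); this
    determines w because the roots span the space.
    """
    roots = [tuple(Fraction(x) for x in r) for r in roots]
    identity = tuple(roots)
    seen = {identity}
    frontier = [identity]
    while frontier:
        images = frontier.pop()
        for alpha in roots:
            new = tuple(reflect(alpha, beta) for beta in images)
            assert set(new) == set(roots)  # reflections permute the roots
            if new not in seen:
                seen.add(new)
                frontier.append(new)
    return len(seen)


A2_ROOTS = [(1, -1, 0), (-1, 1, 0), (0, 1, -1), (0, -1, 1), (1, 0, -1), (-1, 0, 1)]
B2_ROOTS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]
```

The count `weyl_group_order(A2_ROOTS)` also equals the number of connected components left after removing the three hyperplanes orthogonal to the roots.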

The $\mathbb{Z}$-span of the roots $\alpha \in \Phi$ is called the root lattice of $\mathfrak{g}$; it is positive-definite, the orthogonal direct sum of copies of $\mathbb{Z}$ and the lattices $A_n$, $D_n$, $E_6$, $E_7$, $E_8$ of Section 1.2.1, all appropriately scaled. The Weyl group is a group of automorphisms of the root lattice, normal and of small index in the full automorphism group.

Let $\alpha_1, \ldots, \alpha_r$ be the roots orthogonal to the walls of the positive chamber $C$, with the sign of each $\alpha_i$ chosen so that $(\alpha_i|C)$ is positive. Then those $\alpha_i$ form a basis $\Delta$ for $\mathfrak{h}^*$, called a base; the $\alpha_i$ are called simple roots. Moreover, given any root $\alpha \in \Phi$, either $\alpha$ or $-\alpha$ lies in $\mathbb{N}\alpha_1 + \cdots + \mathbb{N}\alpha_r$; we say $\alpha$ is positive or negative, respectively. The root-space decomposition (1.5.5a) can be written in the form

$$\mathfrak{g} = \mathfrak{n}_+ \oplus \mathfrak{h} \oplus \mathfrak{n}_-, \tag{1.5.5d}$$

called a triangular decomposition, where $\mathfrak{n}_\pm$ is the sum of the positive (negative) root spaces. The grading implies $[\mathfrak{h}\mathfrak{h}] = 0$, $[\mathfrak{n}_\pm\mathfrak{n}_\pm] \subseteq \mathfrak{n}_\pm$, $[\mathfrak{h}\mathfrak{n}_\pm] \subseteq \mathfrak{n}_\pm$. Any Lie algebra with a triangular decomposition has Verma modules, as we will see [432].

Once we have a base $\Delta$, we get a Cartan matrix $A$ (and hence a Coxeter–Dynkin diagram) through the formula

$$a_{ij} := 2\,\frac{(\alpha_i|\alpha_j)}{(\alpha_i|\alpha_i)}. \tag{1.5.5e}$$

For each simple root $\alpha_i \in \Delta$, we get elements $e_i \in \mathfrak{g}_{\alpha_i}$, $f_i \in \mathfrak{g}_{-\alpha_i}$, $h_i \in \mathfrak{h}$ that span a copy of $\mathfrak{sl}_2(\mathbb{C})$, and together these $3r$ elements generate all of $\mathfrak{g}$. In fact, these are the elements referred to in Definition 1.4.5(b), and $\mathfrak{g}$ is isomorphic to that Lie algebra $\mathfrak{g}(A)$. The cardinality $r$ of any base is called the rank of $\mathfrak{g}$. Incidentally, an arrow between vertices $i$, $j$ in a diagram always points towards the simple root of smaller norm.

Thus we get a Coxeter–Dynkin diagram from $\mathfrak{g}$ by making two arbitrary choices: a Cartan subalgebra $\mathfrak{h}$ and a positive chamber $C$. Different choices are related by symmetries (inner automorphisms) of $\mathfrak{g}$, and the resulting diagram is uniquely determined. This is a powerful paradigm: to understand and classify a rigid structure, find and study a combinatorial characterisation. Later we apply this strategy to conformal field theories.

Table 1.4. The simple roots and fundamental weights for the classical algebras

Algebra   Simple root $\alpha_i$                                   Fundamental weight $\omega_i$
$A_r$     $e_i - e_{i+1}$, $1 \le i \le r$                         $\sum_{j=1}^{i} e_j - \frac{i}{r+1}\sum_{j=1}^{r+1} e_j$
$B_r$     $e_i - e_{i+1}$, $1 \le i < r$                           $e_1 + \cdots + e_i$, $1 \le i < r$
          $e_r$                                                    $\frac{1}{2}(e_1 + \cdots + e_r)$
$C_r$     $\frac{1}{\sqrt{2}}(e_i - e_{i+1})$, $1 \le i < r$       $\frac{1}{\sqrt{2}}(e_1 + \cdots + e_i)$, $1 \le i \le r$
          $\sqrt{2}\,e_r$
$D_r$     $e_i - e_{i+1}$, $1 \le i < r$                           $e_1 + \cdots + e_i$, $1 \le i < r-1$
          $e_{r-1} + e_r$                                          $\frac{1}{2}(e_1 + \cdots + e_{r-1} - e_r)$, $i = r-1$
                                                                   $\frac{1}{2}(e_1 + \cdots + e_r)$, $i = r$

These choices though should disturb the mathematician in us. Perhaps the presence of the Weyl group in the following is a hint that we are doing Lie theory badly. Just as the vector space 'symmetry' $\mathrm{GL}_n$ is the artificial consequence of choosing a basis, so is the Weyl group the bad karma caused by selecting one positive chamber over all others. Probably an approach based on Vogel's universal Lie algebra (Section 1.6.2) will ultimately be preferable.

In any case, we are most interested in the Killing form and Weyl group restricted to $\mathfrak{h}^*$. Given simple roots $\alpha_i$, define fundamental weights $\omega_i \in \mathfrak{h}^*$ to be the dual basis $(\omega_i|\alpha_j) = \delta_{ij}$. They lie on the edges of the chamber $C$. Their $\mathbb{Z}$-span is the lattice dual to the root lattice, called the weight lattice. Denote by $P_+$ the intersection of the weight lattice with $C$, so $\lambda \in P_+$ if and only if $\lambda = \sum_{i=1}^{r} \lambda_i\omega_i$ where each Dynkin label $\lambda_i$ lies in $\mathbb{N}$. These $\lambda$, called dominant integral weights, are the $r$-tuples $(\lambda_1, \ldots, \lambda_r) \in \mathbb{N}^r$ of Theorem 1.5.1.

Table 1.4 gives the $\alpha_i$ and $\omega_i$ for the classical algebras, using an orthonormal basis of $\mathbb{R}^r$ ($\mathbb{R}^{r+1}$ for $A_r$). Nodes are labelled as in Figure 1.17; this is the labelling used in, for example, [328] but not by all other authors. The table makes manifest the Killing form on $\mathfrak{h}^*$, and is useful in the study of affine Kac–Moody algebras (Section 3.2). More data for the simple Lie algebras, including the exceptional ones (avoided here for reasons of brevity), can be found in section 6.7 of [328], chapter 7 of [214], and especially pages 265–90 of [84].

The Weyl group of $\mathfrak{g} = \mathfrak{sl}_n(\mathbb{C})$ is the symmetric group $S_n$, and acts on $\mathfrak{h}^*$ by permuting the subscripts of the orthonormal vectors $e_i$ of Table 1.4: $\sigma\bigl(\sum_i c_i e_i\bigr) = \sum_i c_i e_{\sigma i}$. Figure 1.18 gives the root systems of the semi-simple Lie algebras of rank 2. A choice of simple roots is indicated by the numerals '1' and '2'. In Figure 1.19 portions of the weight lattices of $\mathfrak{g} = \mathfrak{sl}_2(\mathbb{C})$ and $\mathfrak{g} = \mathfrak{sl}_3(\mathbb{C})$ are displayed, along with simple roots and fundamental weights, and the Weyl reflections $r_i = r_{\alpha_i}$ through the simple roots. Note the $S_2 = \{\pm 1\}$ symmetry of the $A_1$ weight lattice, and the $S_3$ symmetry of the $A_2$ weight lattice.

Fig. 1.18 The root systems of the rank 2 algebras.

Fig. 1.19 Some of the weights of $A_1$ and $A_2$.

Fig. 1.20 Cvitanović's Magic Triangle. [Triangular array of Lie algebras, each with a module dimension. Bottom row: $A_1$, 3; $A_2$, 8; $G_2$, 14; $D_4$, 28; $F_4$, 52; $E_6$, 78; $E_7$, 133; $E_8$, 248.]


The first hint that the exceptional Lie algebras are not especially exceptional (i.e. that they fall into a common series) is Freudenthal's Magic Square (Table 1.3). A second is Cvitanović's Magic Triangle [126], [129] (Figure 1.20). The clearest example of a family of Lie algebras is $A_r$, where in fact the representation rings of smaller $A_r$ embed in those of the larger (the characters are the Schur polynomials in infinitely many variables, appropriately restricted). For example, the formulae $L(\omega_1) \otimes L(\omega_k) = L(\omega_1 + \omega_k) \oplus L(\omega_{k+1})$ and $\dim L(\omega_k) = \binom{r+1}{k}$ hold for all $k$ and $A_r$, although, for example, $L(\omega_2) = L(0)$ and $L(\omega_3) = 0$ for $A_1$. Something similar (though more complicated) happens for the 'exceptional series', i.e. the Lie algebras in the bottom row of the Magic Triangle. For instance, the decompositions of various powers $\mathfrak{g}^{\otimes k}$ of the adjoint modules into irreducibles take the same form (e.g. $\mathfrak{g} \otimes \mathfrak{g} = L(0) \oplus Y_2 \oplus Y_2^* \oplus \mathfrak{g} \oplus X_2$, where, for example, for $\mathfrak{g} = G_2, F_4, E_8$ respectively we have $Y_2 = L(2\omega_1), L(2\omega_1), L(2\omega_7)$, $Y_2^* = L(2\omega_2), L(2\omega_4), L(\omega_1)$, $\mathfrak{g} = L(\omega_1), L(\omega_1), L(\omega_7)$ and $X_2 = L(3\omega_2), L(\omega_2), L(\omega_6)$), and the dimension of the adjoint representation is given by the uniform equation $\dim \mathfrak{g} = 2(5h^\vee - 6)(h^\vee + 1)/(h^\vee + 6)$, where $h^\vee$ is the dual Coxeter number of Section 3.2.3. For more examples, see [126], [129] and references therein.
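The uniform dimension formula is easy to test against the whole exceptional series. In the sketch below, the adjoint dimensions are those in the bottom row of the Magic Triangle; the dual Coxeter numbers are the standard values, which we supply ourselves since they only appear later in the book (Section 3.2.3). The dictionary and function names are illustrative.

```python
from fractions import Fraction

# Standard dual Coxeter numbers (supplied here as an assumption; not listed in this section).
DUAL_COXETER = {"A1": 2, "A2": 3, "G2": 4, "D4": 6, "F4": 9, "E6": 12, "E7": 18, "E8": 30}

# Dimensions of the adjoint representations (bottom row of the Magic Triangle).
ADJOINT_DIM = {"A1": 3, "A2": 8, "G2": 14, "D4": 28, "F4": 52, "E6": 78, "E7": 133, "E8": 248}


def uniform_dimension(h):
    """dim g = 2 (5h - 6)(h + 1) / (h + 6), with h the dual Coxeter number."""
    return Fraction(2 * (5 * h - 6) * (h + 1), h + 6)


def formula_holds_for_series():
    return all(uniform_dimension(DUAL_COXETER[name]) == ADJOINT_DIM[name]
               for name in ADJOINT_DIM)
```

Exact rational arithmetic is used so that a non-integer value of the formula would be detected rather than rounded away.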

Note that the exceptional series is nested:

$$A_1 \subset A_2 \subset G_2 \subset D_4 \subset F_4 \subset E_6 \subset E_7 \subset E_8.$$

Taking any pair $\mathfrak{h} \subset \mathfrak{g}$, the corresponding entry in the Magic Triangle is the centraliser $\mathfrak{c}$ of $\mathfrak{h}$ in $\mathfrak{g}$, and the number there is the dimension of an irreducible module of $\mathfrak{c}$, unique up to outer automorphism, defined by the decomposition of $\mathfrak{g}$ as a $\mathfrak{c} \oplus \mathfrak{h}$-module. For simplicity Figure 1.20 is watered-down by using Lie algebras in place of Lie groups (e.g. the 0's along the top diagonal are really finite groups); see [129] for details. This exceptional series is explained by Vogel's universal Lie algebra (Section 1.6.2).


1.5.3 Weyl characters 


Let $\mathfrak{g}$ be any complex finite-dimensional semi-simple Lie algebra. The analysis of the last subsection on the adjoint representation can be generalised to the other finite-dimensional $\mathfrak{g}$-modules. Recall the notation introduced last subsection. Let $\Phi^+$ be the positive roots. For each $\alpha \in \Phi^+$, choose $e_\alpha \in \mathfrak{g}_\alpha$, $f_\alpha \in \mathfrak{g}_{-\alpha}$ and $h_\alpha \in \mathfrak{h}$ as before, and write $e_i$, $f_i$, $h_i$ for these corresponding to the simple root $\alpha_i \in \Delta$. Let $\omega_i$ be the fundamental weights, as before.

For all representations $\rho : \mathfrak{g} \to \mathfrak{gl}(V)$ of interest to us, in particular all of the finite-dimensional ones, the matrices $\rho(h)$ for $h \in \mathfrak{h}$ will be simultaneously diagonalisable. The analogue of (1.5.4c) is the weight-space decomposition

$$V = \bigoplus_{\beta \in \Omega(\rho)} V_\beta, \tag{1.5.6a}$$

where these functionals $\beta \in \Omega(\rho) \subset \mathfrak{h}^*$ are called the weights of $\rho$. For example, the non-zero weights of the adjoint representation $\mathrm{ad}\,\mathfrak{g}$ are the roots. For any finite-dimensional $\rho$, the $\beta$ all lie in the weight lattice $\mathbb{Z}\omega_1 + \cdots + \mathbb{Z}\omega_r$. These weight spaces

$$V_\beta := \{v \in V \mid h.v = \beta(h)v \ \ \forall h \in \mathfrak{h}\} \tag{1.5.6b}$$

will no longer be one-dimensional in general; the dimension $\dim V_\beta$ is called the multiplicity of $\beta$ in $\rho$. The grading (1.5.4d) now becomes

$$f_\alpha V_\beta \subseteq V_{\beta - \alpha}, \qquad e_\alpha V_\beta \subseteq V_{\beta + \alpha}. \tag{1.5.6c}$$

The weight-space decomposition, or equivalently the weights $\beta \in \Omega(\rho)$ and their multiplicities, uniquely determines any finite-dimensional module (up to equivalence). The Weyl group $W$ acts on weights via (1.5.5c), and preserves multiplicities:

$$\dim V_\beta = \dim V_{w\beta}, \qquad \forall w \in W,\ \beta \in \Omega(\rho). \tag{1.5.6d}$$

Fig. 1.21 The weights of representations of $A_1$.

In Section 3.2.3 we learn that this innocent symmetry (1.5.6d) is a key to the appearance of modularity in affine Kac–Moody algebras.

For an example, recall the Verma module $M(\lambda)$ for $\mathfrak{sl}_2(\mathbb{C})$ constructed in Section 1.5.1. Strictly speaking we should write $\lambda\omega_1$ for the highest weight $\lambda$. This representation has weights $(\lambda - 2j)\omega_1$ for $j = 0, 1, 2, \ldots$, all with multiplicity 1. Moreover, the unitary module $L(n) = L(n\omega_1)$ has weights $(n - 2j)\omega_1$ for $j = 0, 1, \ldots, n$, again all with multiplicity 1. The weight-spaces $L(n)_{m\omega_1}$ are $\mathbb{C}x_{(n-m)/2}$. The Weyl group $W = \mathbb{Z}_2$ acts here by sending $i\omega_1$ to $-i\omega_1$. See Figure 1.21 for the weights of $A_1$-representations $L(3\omega_1)$ and $L(4\omega_1)$. We label weights in the same Weyl orbit with the same letter.

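Given any functional $\lambda = \sum_{i=1}^{r} \lambda_i\omega_i \in \mathfrak{h}^*$, a highest-weight module $M$ with highest weight $\lambda$ is a $\mathfrak{g}$-module generated by a nonzero vector $v \in M$ obeying

$$e_\alpha.v = 0, \qquad \forall \alpha \in \Phi^+, \tag{1.5.7a}$$
$$h_i.v = \lambda_i v, \qquad 1 \le i \le r. \tag{1.5.7b}$$

Of course by linearity (1.5.7b) implies that $h_\alpha.v = \lambda(h_\alpha)v$ for all positive roots $\alpha$ (not just the simple ones), and more generally $h.v = \lambda(h)v$ $\forall h \in \mathfrak{h}$. The module $M$ is generated by $v$ in the sense that $M$ is the span of all vectors of the form

$$x_{(1)} \cdots x_{(m)}.v = x_{(1)}.(\cdots.(x_{(m)}.v)\cdots),$$

as the vectors $x_{(i)}$ range over all of $\mathfrak{g}$. This $v$ is called the highest-weight vector. The $\mathfrak{g}$-modules of greatest interest to us are the highest-weight ones. The name comes from the fact that for all weights $\mu \in \Omega(M)$ except $\mu = \lambda$, the difference $\lambda - \mu$ is a nonzero element of $\mathbb{N}\alpha_1 + \cdots + \mathbb{N}\alpha_r$, so that $\lambda$ is the highest of the weights.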

By the Verma module $M(\lambda)$ we mean the largest or universal or free $\mathfrak{g}$-module with highest weight $\lambda$. Any other $\mathfrak{g}$-module with highest weight $\lambda$ can be constructed from this. To make this more precise, we first define the analogue here of the group algebra $\mathbb{C}G$.

As we know, a basis for $\mathfrak{g}$ is $e_\alpha$, $f_\alpha$ for all positive roots $\alpha \in \Phi^+$, together with the elements $h_i$. The universal enveloping algebra $U(\mathfrak{g})$ is the largest associative algebra generated by those $\|\Phi\| + \|\Delta\|$ symbols $e_\alpha$, $f_\alpha$, $h_i$, which obey all identities of the form $xy - yx = [xy]$ for all $x, y \in \mathfrak{g}$. More precisely, $U(\mathfrak{g})$ is the quotient of the free associative algebra on those $\|\Phi\| + \|\Delta\|$ symbols by the ideal generated by all elements $xy - yx - [xy]$. The starting point for the theory of $U(\mathfrak{g})$ is:

Theorem 1.5.2 (Poincaré–Birkhoff–Witt) A basis for $U(\mathfrak{g})$ is the set of monomials

$$\Bigl(\prod_{\alpha \in \Phi^+} f_\alpha^{m_\alpha}\Bigr)\Bigl(\prod_{\alpha \in \Phi^+} e_\alpha^{n_\alpha}\Bigr)\Bigl(\prod_{i=1}^{r} h_i^{p_i}\Bigr),$$

for all choices of integers $m_\alpha \ge 0$, $n_\alpha \ge 0$, $p_i \ge 0$.

The basis element corresponding to $m_\alpha = n_\alpha = p_i = 0$ is denoted 1. The associative algebra $U(\mathfrak{g})$ is not commutative, so to define the products $\prod$, we must make some arbitrary ordering of the positive roots $\Phi^+$; it doesn't matter how we do this. The Poincaré–Birkhoff–Witt Theorem holds for any Lie algebra (not necessarily semi-simple). See the proof and discussion in chapter III of [348]. That those monomials span $U(\mathfrak{g})$ is clear; more difficult is to show that they are linearly independent.

In Section 6.2.3 we use $U(\mathfrak{g})$ to construct quantum groups. Here what is significant is that its representation theory is identical to that of $\mathfrak{g}$. This isn't deep: the matrices $\rho(x)$, for $x \in \mathfrak{g}$, generate an associative (matrix) algebra. Thus we have replaced the task of finding modules of the non-associative algebra $\mathfrak{g}$ with the simpler but equivalent task of finding modules of the associative (though infinite-dimensional) algebra $U(\mathfrak{g})$. The relation between $\mathfrak{g}$ and $U(\mathfrak{g})$ is quite analogous to that between $G$ and $\mathbb{C}G$, except that $\mathbb{C}G$ is somewhat simpler due to $G$ already having an associative product.

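Let $J(\lambda)$ be the left-ideal of $U(\mathfrak{g})$ generated by all $e_\alpha$ and all $h_i - \lambda_i 1$. This means

$$J(\lambda) = \Bigl\{\sum_{\alpha \in \Phi^+} x_\alpha e_\alpha + \sum_{i=1}^{r} y_i(h_i - \lambda_i 1) \;\Bigm|\; x_\alpha, y_i \in U(\mathfrak{g})\Bigr\}.$$

The Verma module $M(\lambda)$ can now be defined to be the quotient of $U(\mathfrak{g})$ by $J(\lambda)$. It is a (left) $U(\mathfrak{g})$-module, and hence a $\mathfrak{g}$-module. By the Poincaré–Birkhoff–Witt Theorem, the infinite set of elements of the form

$$v_{\{m\}} := \Bigl(\prod_{\alpha \in \Phi^+} f_\alpha^{m_\alpha}\Bigr) v, \tag{1.5.8a}$$

for all integers $m_\alpha \ge 0$, forms a basis for $M(\lambda)$. The action of $e_\alpha, f_\alpha, h_i \in \mathfrak{g}$ on these vectors $v_{\{m\}}$ is obtained using the commutation relations of $\mathfrak{g}$ together with (1.5.7). In particular, we find that $v_{\{m\}}$ is an eigenvector for all operators $h_i$, and corresponds to weight $\lambda - \sum_\alpha m_\alpha\alpha$. Thus the weight-space decomposition of the Verma module $M(\lambda)$ is

$$M(\lambda) = \bigoplus_{\alpha' \in \mathbb{N}\alpha_1 + \cdots + \mathbb{N}\alpha_r} M(\lambda)_{\lambda - \alpha'}, \tag{1.5.8b}$$

where $M(\lambda)_{\lambda - \alpha'}$ has basis consisting of all $v_{\{m\}}$ with $\alpha' = \sum_\alpha m_\alpha\alpha$.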
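By the basis just described, the multiplicity of the weight $\lambda - \alpha'$ in $M(\lambda)$ is the number of ways of writing $\alpha'$ as a sum of positive roots with multiplicities $m_\alpha \ge 0$ (a count often called Kostant's partition function, a name the text does not use). The following sketch computes it by brute-force recursion. Positive roots and the target $\alpha'$ are given by their coefficients in the simple roots; the example data for $A_2$, with positive roots $\alpha_1$, $\alpha_2$, $\alpha_1 + \alpha_2$, is our own choice of illustration.

```python
from functools import lru_cache


def verma_multiplicity(positive_roots, target):
    """Number of tuples (m_alpha) of non-negative integers with sum m_alpha * alpha = target.

    Roots and target are tuples of non-negative integer coefficients in the simple roots.
    """
    roots = tuple(tuple(r) for r in positive_roots)

    @lru_cache(maxsize=None)
    def count(k, remaining):
        # Count decompositions of `remaining` using only roots[k:].
        if all(x == 0 for x in remaining):
            return 1
        if k == len(roots):
            return 0
        total = 0
        current = remaining
        while all(x >= 0 for x in current):  # try m_k = 0, 1, 2, ...
            total += count(k + 1, current)
            current = tuple(x - y for x, y in zip(current, roots[k]))
        return total

    return count(0, tuple(target))


A2_POSITIVE_ROOTS = [(1, 0), (0, 1), (1, 1)]
```

For $A_2$ the answer for $\alpha' = a\alpha_1 + b\alpha_2$ is $\min(a, b) + 1$, since the only freedom is how many copies of $\alpha_1 + \alpha_2$ to use.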

The Verma module $M(\lambda)$ is indecomposable but may or may not be reducible (see Question 1.5.5). The general way to handle modules that aren't completely reducible is to use quotients, exactly as we did with the composition series for finite groups. In particular, $M(\lambda)$ always has a unique maximal submodule $K(\lambda) \ne M(\lambda)$, and for it the quotient $L(\lambda) := M(\lambda)/K(\lambda)$ is irreducible. More generally, every $U(\mathfrak{g})$-module with highest weight $\lambda$ can be obtained by quotienting $M(\lambda)$ by some submodule; the quotient $L(\lambda)$ can thus be regarded as the smallest $U(\mathfrak{g})$-module with highest weight $\lambda$, and is the module in which we are primarily interested. In particular, the finite-dimensional irreducible modules named in Theorem 1.5.1 are precisely these quotients $L(\lambda)$.

Fig. 1.22 The weights of modules of $A_2$.

This maximal submodule $K(\lambda)$ is the space of all null-vectors. For dominant integral weights $\lambda \in P_+$, it is the span of all vectors of the form

$$\Bigl(\prod_{\alpha \in \Phi^+} f_\alpha^{c_\alpha}\Bigr) f_i^{\lambda_i + 1} v,$$

for any choice of integers $c_\alpha \in \mathbb{N}$, and any $i$.

Figure 1.22 gives the weights for $A_2$-modules $L(\omega_1 + \omega_2)$, $L(2\omega_1)$ and $L(2\omega_2)$. We denote a weight $\beta = \sum_i \beta_i\omega_i$ by its Dynkin labels $\beta_i \in \mathbb{Z}$. All multiplicities in Figures 1.21 and 1.22 are 1 except for $L(\omega_1 + \omega_2)_{(0,0)}$, which has multiplicity 2. Incidentally, $L(\omega_1 + \omega_2)$ is the adjoint representation, while $L(2\omega_1)$ and $L(2\omega_2)$ are contragredient.

As usual, it is hard to compare modules directly: $\rho$ and $\rho'$ could be equivalent (i.e. differ merely by a change-of-basis) but look very different. Or given some module, we may wish to decompose it into the direct sum of irreducible modules $L(\lambda)$. For finite groups, we use characters to clarify their representation theory, projecting away the extraneous basis-dependent details; Weyl showed that something similar works here.

The character of a $\mathfrak{g}$-module $V$, with weight-space decomposition (1.5.6a), is

$$\mathrm{ch}_V(z) := \sum_{\beta \in \Omega(V)} \dim V_\beta \, e^{\beta(z)}, \tag{1.5.9a}$$

for any $z \in \mathfrak{h}$. If we coordinatise $\mathfrak{h}$ and $\mathfrak{h}^*$ by $z = \sum_i z_i h_i$ and $\beta = \sum_j \beta_j\omega_j$, we can use

$$\beta(z) = \sum_{i=1}^{r} \beta_i z_i \, \frac{2}{(\alpha_i|\alpha_i)}. \tag{1.5.9b}$$

For example, for the $A_1$-module $L(n\omega_1)$ we find

$$\mathrm{ch}_{L(n\omega_1)}(zh) = \sum_{j=0}^{n} e^{(n-2j)z} = \frac{e^{(n+1)z} - e^{-(n+1)z}}{e^{z} - e^{-z}}, \tag{1.5.10a}$$

where we obtained the formula on the right by summing the geometric series. Note that its numerator and denominator are alternating sums over the Weyl group $S_2$ of $A_1$. By comparison, the character for the Verma module $M(\lambda\omega_1)$ is

$$\mathrm{ch}_{M(\lambda\omega_1)}(zh) = \sum_{j=0}^{\infty} e^{(\lambda - 2j)z} = \frac{e^{\lambda z}}{1 - e^{-2z}}. \tag{1.5.10b}$$

More generally, the Verma module $M(\lambda)$ for $\mathfrak{g}$ has character

$$\mathrm{ch}_{M(\lambda)}(z) = \frac{e^{\lambda(z)}}{\prod_{\alpha \in \Phi^+}\bigl(1 - e^{-\alpha(z)}\bigr)}. \tag{1.5.10c}$$

The Weyl character formula expresses the character of any finite-dimensional irreducible module $L(\lambda)$ for any semi-simple $\mathfrak{g}$ as a fraction: the numerator is an alternating sum over the Weyl group $W$, and the denominator is a product over positive roots $\alpha \in \Phi^+$. More precisely,

$$\mathrm{ch}_\lambda(z) := \mathrm{ch}_{L(\lambda)}(z) = e^{-\rho(z)} \, \frac{\sum_{w \in W} \det(w)\, e^{(w(\lambda + \rho))(z)}}{\prod_{\alpha \in \Phi^+}\bigl(1 - e^{-\alpha(z)}\bigr)}, \tag{1.5.11}$$

where $\rho = \sum_{i=1}^{r} \omega_i$ here is the Weyl vector. For a proof see, for example, chapter 14 of [214]. This formula and its generalisations have profound consequences (see Section 3.4.2).

Finite groups have only finitely many irreducible modules, while Lie algebras have infinitely many. Otherwise their theory is quite analogous, and in particular Lie algebra characters work as effectively as finite group characters.
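For $A_1$ the character formulas can be checked numerically in a few lines. The sketch below assumes the $A_1$ data used in this subsection: the module $L(n\omega_1)$ has weights $(n - 2j)\omega_1$ for $j = 0, \ldots, n$, each of multiplicity 1, the Weyl vector is $\rho = \omega_1$, the single positive root is $2\omega_1$, and the Weyl group is $\{\pm 1\}$. It compares the sum over weights with the Weyl-type quotient $e^{-z}\bigl(e^{(n+1)z} - e^{-(n+1)z}\bigr)/\bigl(1 - e^{-2z}\bigr)$, and a truncated Verma sum with $e^{\lambda z}/(1 - e^{-2z})$. Function names are illustrative.

```python
import math


def character_by_weights(n, z):
    """Sum of e^{(n - 2j) z} over the weights of L(n omega_1)."""
    return sum(math.exp((n - 2 * j) * z) for j in range(n + 1))


def character_by_weyl_formula(n, z):
    """Alternating sum over W = {1, -1} divided by the product over the positive root."""
    numerator = math.exp((n + 1) * z) - math.exp(-(n + 1) * z)
    return math.exp(-z) * numerator / (1.0 - math.exp(-2.0 * z))


def verma_character_partial(lam, z, terms):
    """Partial sum of the Verma character; converges for z > 0."""
    return sum(math.exp((lam - 2 * j) * z) for j in range(terms))


def verma_character_closed(lam, z):
    return math.exp(lam * z) / (1.0 - math.exp(-2.0 * z))
```

The quotient is undefined at $z = 0$ as written, although the sum over weights is perfectly regular there and returns the dimension $n + 1$.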


Theorem 1.5.3 Let $\mathfrak{g}$ be a finite-dimensional semi-simple Lie algebra, and $M$, $N$ two finite-dimensional modules. Then $\mathrm{ch}_M(z) = \mathrm{ch}_N(z)$ for all $z \in \mathfrak{h}$ iff $M$ and $N$ are equivalent as $\mathfrak{g}$-modules. Moreover, $\mathrm{ch}_{M \oplus N}(z) = \mathrm{ch}_M(z) + \mathrm{ch}_N(z)$, $\mathrm{ch}_{M \otimes N}(z) = \mathrm{ch}_M(z)\,\mathrm{ch}_N(z)$ and $\mathrm{ch}_{M^*}(z) = \mathrm{ch}_M(-z)$.


As before, the characters are also enormously simpler than the modules themselves: for example, the smallest nontrivial representation of $\mathfrak{g} = E_8$ is a map from $\mathbb{C}^{248}$ to the space of $248 \times 248$ matrices, while its character is a function $\mathbb{C}^8 \to \mathbb{C}$. But why is Weyl's definition (1.5.9a) natural? How did he come up with it?

He used the relation with groups. Consider for concreteness $\mathfrak{g} = A_r$. Given any representation $\rho$, the map $e^x \mapsto e^{\rho(x)}$ is a representation of the Lie group $G = \mathrm{SL}_{r+1}(\mathbb{C})$ corresponding to $\mathfrak{g}$ (the exponential $e^x$ of a matrix is defined by the usual power series, and always converges). The trace of the matrix $e^{\rho(x)}$ is the group character value at $e^x \in G$, so we define it to be the algebra character value at $x \in \mathfrak{g}$. Again, it suffices to restrict to representatives of each conjugacy class of $G$, because the character is a class function. Now, almost every matrix is diagonalisable (since almost any $n \times n$ matrix has $n$ distinct eigenvalues), and so we shouldn't lose much by restricting $x \in \mathfrak{g}$ to diagonalisable matrices. Hence we may take our conjugacy class representatives to be diagonal matrices $x \in \mathfrak{g}$, i.e. to $x \in \mathfrak{h}$. So the Lie algebra character can be chosen to be a function of $z \in \mathfrak{h}$. Finally, the trace of the matrix $e^{\rho(z)}$ is the sum (with multiplicities) of its eigenvalues, which gives us (1.5.9a). This is the intuition behind Weyl's definition (1.5.9a) of character.

However, different diagonal matrices can be conjugate. For instance in $A_1$,

$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} z & 0 \\ 0 & -z \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} -z & 0 \\ 0 & z \end{pmatrix},$$

so $e^{zh}$ and $e^{-zh}$ lie in the same $G = \mathrm{SL}_2(\mathbb{C})$ conjugacy class and $\mathrm{ch}_M(z) = \mathrm{ch}_M(-z)$. This $z \mapsto -z$ symmetry is the Weyl group action on the Cartan subalgebra $\mathfrak{h} = \mathbb{C}h$. Each character $\mathrm{ch}_M$ of any semi-simple $\mathfrak{g}$ is similarly invariant under the Weyl group of $\mathfrak{g}$, thanks to (1.5.6d).

At first glance, it may seem that the Weyl character formula (1.5.11) is not very practical, at least for large rank. For instance, the numerator of (1.5.11) for $E_8$ would involve an alternating sum over the Weyl group, which has about 700 million elements! On the other hand, one alternating sum is very easy to compute: a determinant is an alternating sum over the symmetric group (the Weyl group of the $A$-series). Since all Weyl groups have symmetric subgroups of relatively small index, the numerators and denominators of (1.5.11) actually can be computed quite effectively.

It is common practice in physics to use dimensions to specify irreducible modules. For example, the defining representation of $\mathfrak{sl}_3(\mathbb{C})$ is denoted $\mathbf{3}$, and its contragredient by $\bar{\mathbf{3}}$. This is a terrible habit, as many unrelated modules can have identical dimension. For instance, $\mathfrak{sl}_5$ has six different irreducible modules with dimension 175: namely $L(\lambda)$ with $\lambda = (1, 2, 0, 0)$, $(1, 1, 0, 1)$, $(0, 3, 0, 0)$ and their contragredients $(0, 0, 2, 1)$, $(1, 0, 1, 1)$, $(0, 0, 3, 0)$. The practice should rather be to use highest weights, not dimensions, when labelling finite-dimensional modules.
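The coincidence of dimensions for $\mathfrak{sl}_5$ can be confirmed with the Weyl dimension formula $\dim L(\lambda) = \prod_{\alpha \in \Phi^+} (\lambda + \rho|\alpha)/(\rho|\alpha)$. This formula is a standard consequence of the Weyl character formula but is not stated in the text, so we take it as an assumption here. For $A_r$ the positive roots are the sums $\alpha_i + \cdots + \alpha_j$ of consecutive simple roots, and for such a root $(\lambda + \rho|\alpha) = \sum_{k=i}^{j}(\lambda_k + 1)$ in terms of the Dynkin labels. The function name `dim_sl` is illustrative.

```python
from fractions import Fraction


def dim_sl(labels):
    """Dimension of the sl_{r+1}-module with Dynkin labels (lambda_1, ..., lambda_r),
    via the Weyl dimension formula (assumed, not taken from the text)."""
    r = len(labels)
    dim = Fraction(1)
    for i in range(r):
        for j in range(i, r):
            # positive root alpha_{i+1} + ... + alpha_{j+1}, of height j - i + 1
            numerator = sum(labels[k] + 1 for k in range(i, j + 1))
            dim *= Fraction(numerator, j - i + 1)
    return int(dim)


WEIGHTS_OF_DIMENSION_175 = [
    (1, 2, 0, 0), (1, 1, 0, 1), (0, 3, 0, 0),
    (0, 0, 2, 1), (1, 0, 1, 1), (0, 0, 3, 0),
]
```

The same function gives 3 for both $(1, 0)$ and $(0, 1)$ of $\mathfrak{sl}_3$, the two modules that the dimension notation can only tell apart with a bar.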


1.5.4 Twisted #1: automorphisms and characters 


A fundamental theme of this book is twisting by automorphisms. As we see later, it is central to conformal field theory and string theory, as well as vertex operator algebras, and is implicit in the definition of the McKay–Thompson series $T_g$. Its role in finite-dimensional Lie theory is more elementary, but can be regarded as a toy model for several of this book's most important subsections.

Let $\mathfrak{g}$ be any Lie algebra over $\mathbb{C}$. An automorphism $\gamma$ of $\mathfrak{g}$ is an invertible linear map $\gamma : \mathfrak{g} \to \mathfrak{g}$ that obeys

$$\gamma[x, y] = [\gamma x, \gamma y], \qquad \forall x, y \in \mathfrak{g}.$$

Write $\mathrm{Aut}(\mathfrak{g})$ for the group of automorphisms of $\mathfrak{g}$. When $\gamma \in \mathrm{Aut}(\mathfrak{g})$ has order $n < \infty$, it is diagonalisable on the space $\mathfrak{g}$ (why?). Hence we can write $\mathfrak{g}$ as a direct sum

$$\mathfrak{g} = \bigoplus_{k=0}^{n-1} \mathfrak{g}_k \tag{1.5.12a}$$

of eigenspaces of $\gamma$, where

$$\gamma x = \xi_n^k x, \qquad \forall x \in \mathfrak{g}_k \tag{1.5.12b}$$

(as always, $\xi_n$ denotes the root of unity $\exp[2\pi i/n]$). Because $\gamma$ is an automorphism, (1.5.12a) defines a $\mathbb{Z}_n$-grading on $\mathfrak{g}$, in the sense that

$$[\mathfrak{g}_k, \mathfrak{g}_\ell] \subseteq \mathfrak{g}_{k+\ell}. \tag{1.5.12c}$$

The $\gamma$-invariant space $\mathfrak{g}_0$ is a subalgebra of $\mathfrak{g}$, and the other subspaces $\mathfrak{g}_k$ are $\mathfrak{g}_0$-modules.

For example, let $\mathfrak{g} = \mathfrak{sl}_3(\mathbb{C})$ and choose the usual basis $e_1$, $e_2$, $e_{12} := [e_1e_2]$, $f_1$, $f_2$, $f_{12} := [f_1f_2]$, $h_1$, $h_2$. There is an order-2 automorphism $\gamma$ of $\mathfrak{sl}_3$ corresponding to the left-right symmetry of the $A_2$ Coxeter–Dynkin diagram. It exchanges $e_1$ and $e_2$; therefore

$$\gamma e_{12} = [\gamma e_1, \gamma e_2] = [e_2e_1] = -e_{12}.$$

Continuing in this way, we find that $\gamma$ exchanges $f_1$ and $f_2$, as well as $h_1$ and $h_2$, and sends $f_{12}$ to $-f_{12}$. Thus

$$\mathfrak{g}_0 = \mathrm{span}\{e_1 + e_2,\ f_1 + f_2,\ h_1 + h_2\},$$
$$\mathfrak{g}_1 = \mathrm{span}\{e_1 - e_2,\ e_{12},\ f_1 - f_2,\ f_{12},\ h_1 - h_2\}.$$

The reader can verify that the Lie subalgebra $\mathfrak{g}_0$ is isomorphic to $\mathfrak{sl}_2(\mathbb{C})$, while $\mathfrak{g}_1$ is the irreducible five-dimensional $A_1$-module.
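The $\mathfrak{sl}_3$ example can be checked with explicit $3 \times 3$ matrices. The sketch below makes one concrete choice, which is our assumption rather than anything fixed by the text: $e_1, e_2$ are the matrix units in positions $(1,2)$ and $(2,3)$, $f_1, f_2$ their transposes, and the diagram automorphism is realised as $\gamma(x) = -S x^{t} S^{-1}$ with $S$ the anti-diagonal matrix with entries $1, -1, 1$ (so $S^{-1} = S$). The code then verifies that $\gamma$ preserves brackets, fixes the three listed elements of $\mathfrak{g}_0$ and negates the five listed elements of $\mathfrak{g}_1$.

```python
def unit(a, b):
    """3 x 3 matrix unit with a 1 in row a, column b (1-indexed)."""
    return [[1 if (i, j) == (a - 1, b - 1) else 0 for j in range(3)] for i in range(3)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def add(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]


def scale(c, x):
    return [[c * v for v in row] for row in x]


def sub(x, y):
    return add(x, scale(-1, y))


def transpose(x):
    return [list(col) for col in zip(*x)]


def bracket(x, y):
    return sub(matmul(x, y), matmul(y, x))


S = [[0, 0, 1], [0, -1, 0], [1, 0, 0]]  # S * S is the identity


def gamma(x):
    """One realisation (an assumption) of the diagram automorphism: x -> -S x^t S."""
    return scale(-1, matmul(matmul(S, transpose(x)), S))


e1, e2, f1, f2 = unit(1, 2), unit(2, 3), unit(2, 1), unit(3, 2)
h1, h2 = sub(unit(1, 1), unit(2, 2)), sub(unit(2, 2), unit(3, 3))
e12, f12 = bracket(e1, e2), bracket(f1, f2)
BASIS = [e1, e2, e12, f1, f2, f12, h1, h2]
G0 = [add(e1, e2), add(f1, f2), add(h1, h2)]
G1 = [sub(e1, e2), e12, sub(f1, f2), f12, sub(h1, h2)]


def is_automorphism():
    return all(gamma(bracket(x, y)) == bracket(gamma(x), gamma(y))
               for x in BASIS for y in BASIS)
```

One can also confirm the $\mathfrak{sl}_2$ relation `bracket(G0[0], G0[1]) == G0[2]` inside the fixed-point subalgebra.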

Every Lie algebra has nontrivial automorphisms. For instance, let $x \in \mathfrak{g}$ be such that the operator $\mathrm{ad}\,x$ on $\mathfrak{g}$ is nilpotent, that is there is some integer $k$ such that

$$(\mathrm{ad}\,x)^k y := [x[x \cdots [xy] \cdots ]] = 0, \qquad \forall y \in \mathfrak{g}.$$

For instance, any $x = e_i$ or $x = f_i$ works when $\mathfrak{g}$ is semi-simple. Then $\exp(\mathrm{ad}\,x)$ (defined by the usual power series expansion) is a well-defined invertible operator on $\mathfrak{g}$ and is in fact an automorphism. These automorphisms $\exp(\mathrm{ad}\,x)$ together generate a normal subgroup of $\mathrm{Aut}(\mathfrak{g})$ called the inner automorphisms of $\mathfrak{g}$. The quotient of $\mathrm{Aut}(\mathfrak{g})$ by the inner automorphisms defines a group called the outer automorphisms.
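As a small illustration, the following sketch evaluates $\exp(\mathrm{ad}\,x)$ by its power series for the nilpotent element $x = e$ of $\mathfrak{sl}_2(\mathbb{C})$ and checks that the result respects brackets. The matrices `E`, `F`, `H` are the usual $2 \times 2$ basis, chosen by us for illustration; since $(\mathrm{ad}\,e)^3 = 0$ here, the series terminates and the default number of terms is more than enough.

```python
from fractions import Fraction


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]


def scale(c, x):
    return [[c * v for v in row] for row in x]


def bracket(x, y):
    return add(matmul(x, y), scale(-1, matmul(y, x)))


def exp_ad(x, y, terms=8):
    """exp(ad x) applied to y, summing (ad x)^k y / k! for k < terms.

    Exact only when (ad x)^terms y = 0, e.g. for ad-nilpotent x.
    """
    result, term = y, y
    for k in range(1, terms):
        term = scale(Fraction(1, k), bracket(x, term))
        result = add(result, term)
    return result


E = [[0, 1], [0, 0]]
F = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]


def respects_brackets(x, basis):
    return all(exp_ad(x, bracket(y, z)) == bracket(exp_ad(x, y), exp_ad(x, z))
               for y in basis for z in basis)
```

Here `exp_ad(E, F)` equals $f + h - e$, which is also the conjugate of $f$ by the group element $1 + e$.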

For example, for $\mathfrak{g} = \mathfrak{sl}_n(\mathbb{C})$ the inner automorphisms form a group isomorphic to $\mathrm{PGL}_n(\mathbb{C}) = \mathrm{GL}_n(\mathbb{C})/\{\mathbb{C}^\times I_n\}$, and the group of outer automorphisms is $\mathbb{Z}_2$ for $n > 2$ (and $\{1\}$ for $n = 2$). The outer automorphism takes a matrix $x \in \mathfrak{sl}_n(\mathbb{C})$ to $-x^t$.

As an aside, the group of inner automorphisms of a simple Lie algebra over $\mathbb{C}$ is always a simple group (though infinite). It could be hoped that the same would be true if instead we consider Lie algebras over a finite field $\mathbb{F}_q$. Indeed this is the case (except for five small counterexamples, involving the fields $\mathbb{F}_2$ and $\mathbb{F}_3$). This gives rise to nine of the infinite families of finite simple groups of Lie type (Section 1.1.2); the seven remaining ones are various twists of these groups.

Given any two Cartan subalgebras $\mathfrak{h}_1$, $\mathfrak{h}_2$ of a simple algebra $\mathfrak{g}$, an inner automorphism can be found mapping $\mathfrak{h}_1$ to $\mathfrak{h}_2$ (we say the inner automorphisms act transitively on the set of Cartan subalgebras). Moreover, for any choice of Cartan subalgebra $\mathfrak{h}$, if we take the subgroup of inner automorphisms mapping $\mathfrak{h}$ to itself, and quotient it by the subgroup of inner automorphisms fixing $\mathfrak{h}$ pointwise, then we get the Weyl group of $\mathfrak{g}$. This means that (modulo an inner automorphism) an automorphism of $\mathfrak{g}$ permutes the simple roots; conversely, this permutation uniquely determines it. In other words, the outer automorphisms for semi-simple $\mathfrak{g}$ are in a natural one-to-one correspondence with the symmetries of the Coxeter–Dynkin diagram. These are the most important choices of automorphisms, for our purposes, as the fixed-point subalgebras $\mathfrak{g}_0$ are maximally large.

In particular, the fixed-point subalgebra $\mathfrak{g}_0$ for $\mathfrak{g} = \mathfrak{sl}_{2n}$, when $\gamma$ is taken to be the outer automorphism permuting $e_i$ and $e_{2n-i}$ (the order-2 diagram symmetry), is isomorphic to $\mathfrak{sp}_{2n} = C_n$. Likewise, taking $\mathfrak{g}$ to be (respectively) $A_{2n}$, $D_n$, $D_4$ and $E_6$ and taking $\gamma$ to be the diagram symmetry of order 2, 2, 3 and 2, yields a fixed-point subalgebra $\mathfrak{g}_0$ isomorphic to $\mathfrak{so}_{2n+1} = B_n$, $\mathfrak{so}_{2n-1} = B_{n-1}$, $G_2$ and $F_4$, respectively.

The automorphism group $\mathrm{Aut}(\mathfrak{g})$ permutes the $\mathfrak{g}$-modules, through the formula

$$\rho^\gamma(x) = \rho(\gamma x).$$

Sometimes $\rho^\gamma$ and $\rho$ are isomorphic as $\mathfrak{g}$-modules. In this case there is a matrix $A \in \mathrm{GL}(V)$, where $V$ is the underlying space of $\rho$ and $\rho^\gamma$, such that

$$\rho(\gamma x) = A^{-1}\rho(x)A, \qquad \forall x \in \mathfrak{g}.$$

Let us assume for convenience that $\rho$ is irreducible. Then by Schur's Lemma this matrix $A$ will be well defined up to a scalar multiple.

In fact, for $\mathfrak{g}$ semi-simple and $\gamma$ corresponding to a diagram symmetry, and $\rho$ the module $L(\lambda)$, $\rho^\gamma$ will be the module $L(\gamma\lambda)$, where $\gamma$ acts on weights by permuting Dynkin labels. Thus $\rho^\gamma \cong \rho$ iff $\gamma\lambda = \lambda$. In this case there is a canonical choice of matrix $A$, sending weight-space $L(\lambda)_\beta$ to weight-space $L(\lambda)_{\gamma\beta}$, given by

$$f_{m_1} \cdots f_{m_k}.v_\lambda \mapsto f_{\gamma m_1} \cdots f_{\gamma m_k}.v_\lambda.$$


Recall Thompson's trick: twisting the graded dimension (0.3.2) to get the McKay–Thompson series $T_g$ of (0.3.3). Here, this becomes the $\gamma$-twisted or twining character

$$\mathrm{ch}^\gamma_\lambda(h) := \mathrm{tr}_V\, A \exp[\rho(h)] = \sum_{\beta = \gamma\beta} \mathrm{tr}(A_\beta) \exp[\beta(h)], \tag{1.5.13}$$

where we can restrict the sum to all weights $\beta \in \Omega(L(\lambda))$ that are fixed by $\gamma$, and where $A_\beta$ is the restriction of $A$ to the weight-space $L(\lambda)_\beta$. The term 'twining', introduced in [213], is short for 'intertwining'. In terms of the basis (1.5.8a) for the weight-spaces $L(\lambda)_\beta$, $A_\beta$ is a permutation matrix when $\beta$ is fixed by $\gamma$ (only these $\beta$ survive in (1.5.13)).

For example, consider first $\mathfrak{g} = D_4$ and $\gamma$ the diagram automorphism interchanging the third and fourth nodes. The dominant weight $\lambda = (1, 0, 0, 0)$ is invariant under $\gamma$. The $D_4$-representation $L(\lambda)$ is eight-dimensional, with all weight-spaces $L(\lambda)_\beta$ having dimension 1. It is thus easy to compute the twisted character $\mathrm{ch}^\gamma_\lambda$: it has a term with coefficient 1 for each $\gamma$-invariant weight, namely $\beta = (\pm 1, 0, 0, 0)$, $(0, \pm 1, 0, 0)$, $(0, 0, \pm 1, 0)$ when written in the orthonormal basis of Table 1.4. For a more complicated example, consider $D_4$ again, but with the order-3 automorphism ('triality') and the invariant dominant weight $\lambda = (0, 1, 0, 0)$: this $D_4$-representation is 28-dimensional but only its weights $\beta = \pm(0, 1, 0, 0)$, $\pm(1, -1, 1, 1)$, $\pm(1, -2, 1, 1)$, $(0, 0, 0, 0)$ are triality-invariant. Of those, the weight-spaces are all one-dimensional except for $L(0, 1, 0, 0)_{(0,0,0,0)}$, which is four-dimensional. A basis for that weight-space consists of

Bhfhhhvw, fppphhahv. fpbhfahrv. fafs fafifzv- 


The map A(o,0,0,0) cyclically permutes the first three basis vectors, but fixes the fourth. 
Thus the twisted character has seven terms, each with coefficient 1. For similar calcula- 
tions with small-rank algebras, the concrete bases given in [383] are useful. 

If we restrict to h in the Cartan subalgebra of the fixed-point subalgebra g_0, the result
will lie in the character ring of g_0. Thus the twisted character ch^γ_λ is a virtual character for
the fixed-point subalgebra g_0, that is, it is a linear combination over Z of true characters.
However, ch^γ_λ itself need not be a true character of g_0.

For example, recall the example g = D_4, weight λ = (1, 0, 0, 0), and γ interchanging
nodes 3 and 4. Then the fixed-point subalgebra g_0 is B_3 and the twisted character ch^γ_λ
is the virtual B_3 character ch_{(1,0,0)} − ch_{(0,0,0)} (Question 1.5.8(a)). On the other hand, the
other D_4 example has fixed-point subalgebra G_2, and the twisted character equals the
true character of L(0, 1).

Surprisingly, ch^γ_λ is always a true character for the algebra ğ obtained by reversing
the arrows in the diagram of g_0. ğ is called the orbit Lie algebra in [213].

For example, when g = D_4 and λ = (1, 0, 0, 0), we find (Question 1.5.8(b)) that the
twisted character ch^γ_{(1,0,0,0)} equals the character ch_{(1,0,0)} of the orbit Lie algebra ğ.

More generally, we find: 


Theorem 1.5.4 [213] Let g be semi-simple and finite-dimensional, and let γ be the
automorphism of g corresponding to a Coxeter–Dynkin diagram symmetry. Let λ ∈
P_+(g) be any dominant integral weight fixed by γ. Then the twisted character ch^γ_λ,
defined in (1.5.13), restricted to the Cartan subalgebra of the fixed-point subalgebra g_0,
is a virtual character of g_0 and a true character χ_λ̆ of the orbit Lie algebra ğ, for some
λ̆ ∈ P_+(ğ).


A weight λ ∈ P_+(A_{2n}) fixed by the order-two diagram symmetry looks like λ =
(λ_1, …, λ_n, λ_n, …, λ_1); likewise, λ ∈ P_+(A_{2n−1}) fixed by the order-two diagram
symmetry looks like λ = (λ_1, …, λ_{n−1}, λ_n, λ_{n−1}, …, λ_1); while a weight λ ∈ P_+(D_{n+1})
fixed by the n ↔ n + 1 diagram symmetry looks like λ = (λ_1, …, λ_n, λ_n). The orbit
Lie algebra ğ here is C_n for A_{2n} or D_{n+1}, and B_n for A_{2n−1}. In all three cases, λ̆ has
Dynkin labels (λ_1, …, λ_n).

The proof of Theorem 1.5.4 follows that of the Weyl character formula. Although 
Theorem 1.5.4 is not itself important for us, the obvious generalisation holds for affine 
algebras (Theorem 3.4.1), and provides a striking special case of the important orbifold 
construction in string theory and vertex operator algebras. 

In hindsight it is easy to see that ğ is the more natural algebra: for modules, h* is
more relevant than h since that is where the weights live. Consider, for example, D_4
again, with diagram symmetry 3 ↔ 4. Then a γ-invariant weight looks like β_1ω_1 +
β_2ω_2 + β_3(ω_3 + ω_4). Using Table 1.4, we see that these vectors {ω_1, ω_2, ω_3 + ω_4} have
the same inner-products with each other that the fundamental weights of C_3 have (up to
a global factor of 2, which is merely conventional).

Incidentally, some version of these remarks holds for finite groups. Let γ be an
automorphism of a finite group G; then γ permutes the irreducible representations
of G, ρ ↦ ρ ∘ γ, as before. Choose any irreducible representation ρ ≅ ρ ∘ γ and let
A be the isomorphism. The γ-twisted character of ρ is the trace ch^γ_ρ(g) := tr A ρ(g).
It won’t be a class function of G: for example, for the inner automorphism g ↦
h^{-1}gh, ch^γ_ρ(g) = ch_ρ(hg). But this calculation shows that it suffices to consider outer
automorphisms. In particular, diagram automorphisms of finite reductive groups should
be interesting in this context.
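
The remark about inner automorphisms is easy to test on a small group. Here is a minimal sketch, assuming the two-dimensional irreducible representation of S_3 written with the integer matrices S and R below (our choice of basis, not taken from the text). With A = ρ(h) it checks that A intertwines ρ ∘ γ with ρ, and that the resulting twisted character takes different values on two conjugate elements, so it is not a class function.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as tuples of tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def trace(a):
    return a[0][0] + a[1][1]

IDENTITY = ((1, 0), (0, 1))
S = ((0, 1), (1, 0))      # a reflection, S*S = 1
R = ((0, -1), (1, -1))    # a rotation, R*R*R = 1

def generate_group(generators):
    """Close a set of matrices under multiplication (finite group assumed)."""
    elements = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in generators:
            new = matmul(g, s)
            if new not in elements:
                elements.add(new)
                frontier.append(new)
    return elements

GROUP = generate_group([S, R])   # the image rho(S3) of a faithful rho

def inverse(g):
    return next(x for x in GROUP if matmul(g, x) == IDENTITY)

h = S
A = h                            # candidate isomorphism A = rho(h)

def gamma(g):
    """Inner automorphism g -> h^{-1} g h."""
    return matmul(matmul(inverse(h), g), h)

def twisted_character(g):
    """tr A rho(g), which here equals the ordinary character at hg."""
    return trace(matmul(A, g))

# A rho(gamma(g)) = rho(g) A for every g in the group
intertwines = all(matmul(A, gamma(g)) == matmul(g, A) for g in GROUP)

# S and S*R are conjugate, yet the twisted character separates them
g1, g2 = S, matmul(S, R)
conjugate = any(matmul(matmul(inverse(x), g1), x) == g2 for x in GROUP)

print(len(GROUP), intertwines, conjugate)
print(twisted_character(g1), twisted_character(g2))
```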


1.5.5 Representations of Lie groups 


We are more interested in (complex) Lie algebras, but (real) Lie groups do occasionally 
arise. Once again, it is their representation theory that is of greatest interest to us. 

Let G be a real finite-dimensional Lie group, and let H be a complex Hilbert space.
Let B(H) be the group of bounded linear operators with bounded inverse (boundedness
is equivalent to continuity; see Section 1.3.1). A representation or module of G on H is a
homomorphism π : G → B(H) such that the map G → H, defined by g ↦ π(g)v, is
continuous for every v ∈ H. We call two modules π, π′ equivalent if there is a bounded
operator A : H → H′, with bounded inverse, such that A^{-1}π′(g)A = π(g) for all g ∈ G.
The module π is unitary if each operator π(g) is unitary, that is surjective and


\langle \pi(g)v, \pi(g)v' \rangle = \langle v, v' \rangle, \qquad \forall\, v, v' \in H.


The module π is irreducible if there is no closed nontrivial subspace V such that
π(g)V ⊆ V for all g ∈ G. Most important are the irreducible unitary modules, and
these together form a topological space called the unitary dual Ĝ of G.

For example, all one-dimensional modules of the additive group G = R are of the
form x ↦ e^{iαx} for any α ∈ C; it will be unitary iff α ∈ R. The map


x \mapsto \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}


is a representation of R that is not irreducible (consider V = C × {0} ⊂ C² = H). The
one-dimensional modules of the group G = S¹ are e^{iθ} ↦ e^{inθ} for n ∈ Z, and all are
unitary. The unitary duals of R and S¹ are R and Z, respectively.
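
These small examples can be verified numerically. The sketch below (function names are ours) checks the homomorphism property for the one-dimensional module x ↦ e^{iαx} and the two-dimensional unipotent module, tests unitarity for one real and one non-real α, and confirms that the line C × {0} is preserved.

```python
import cmath

def chi(alpha, x):
    """One-dimensional module x -> exp(i*alpha*x) of the additive group R."""
    return cmath.exp(1j * alpha * x)

def unipotent(x):
    """Two-dimensional module x -> [[1, x], [0, 1]]."""
    return ((1.0, x), (0.0, 1.0))

def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

x, y = 0.7, -1.9

# homomorphism property: pi(x) pi(y) = pi(x + y)
hom_1d = abs(chi(2 + 3j, x) * chi(2 + 3j, y) - chi(2 + 3j, x + y)) < 1e-12
hom_2d = all(
    abs(matmul(unipotent(x), unipotent(y))[i][j] - unipotent(x + y)[i][j]) < 1e-12
    for i in range(2) for j in range(2)
)

# a 1x1 operator is unitary exactly when it has absolute value 1
unitary_real = abs(abs(chi(2.0, x)) - 1) < 1e-12        # alpha real
unitary_complex = abs(abs(chi(2 + 3j, x)) - 1) < 1e-12  # alpha not real

# V = C x {0} is invariant: pi(x) sends (1, 0) to a multiple of (1, 0)
m = unipotent(x)
image_of_e1 = (m[0][0] * 1 + m[0][1] * 0, m[1][0] * 1 + m[1][1] * 0)

print(hom_1d, hom_2d, unitary_real, unitary_complex, image_of_e1)
```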

Continuity is an important requirement. For instance, let {b_β}_{β∈B} be a basis for R
treated as a vector space over Q (so B is uncountable). Then for any choice of complex
numbers α_β, the assignment Σ_β r_β b_β ↦ Π_β e^{r_β α_β} (for r_β ∈ Q) defines a (rather
chaotic) group homomorphism R → C^×. Continuity of π is needed in order to obtain from π
a module of the Lie algebra g of G.

Call a vector v ∈ H smooth if g ↦ π(g)v is a smooth function from G to H. The
space H^∞ of smooth vectors forms a dense G-invariant subspace of H; if H is
finite-dimensional, H^∞ equals H. Recall that the Lie algebra g is the tangent space T_eG, and
the exponential map exp sends g to G. For any v ∈ H^∞ and x ∈ g, define


\delta\pi(x)\, v := \frac{\partial}{\partial t}\, \pi(e^{tx})\, v \,\Big|_{t=0}. \qquad (1.5.14)


This defines a g-module on H^∞ called the derived module. Of course, a (complex) module
of the real Lie algebra g lifts to a complex module of its complexification g_C := C ⊗_R g.
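
The derivative in (1.5.14) can be estimated numerically in the simplest case. The following sketch treats only the additive group G = R, where the exponential map is the identity, and uses a central difference; the helper names are hypothetical and the step size is an arbitrary choice.

```python
import cmath

def derived(pi, x, v, t=1e-6):
    """Central-difference estimate of d/dt pi(exp(t x)) v at t = 0.

    For the additive group R the exponential map is the identity,
    so exp(t x) is just the real number t * x.
    """
    plus, minus = pi(t * x, v), pi(-t * x, v)
    return tuple((p - m) / (2 * t) for p, m in zip(plus, minus))

def pi_char(alpha):
    """x -> exp(i*alpha*x) acting on a one-component vector."""
    return lambda x, v: (cmath.exp(1j * alpha * x) * v[0],)

def pi_unipotent(x, v):
    """x -> [[1, x], [0, 1]] acting on a two-component vector."""
    return (v[0] + x * v[1], v[1])

d_char = derived(pi_char(2.0), 1.0, (1.0,))       # close to 2i
d_unip = derived(pi_unipotent, 1.0, (0.0, 1.0))   # close to (1, 0)

print(d_char, d_unip)
```

So the derived module of x ↦ e^{iαx} multiplies by iα, while that of the unipotent module is the nilpotent matrix with a single 1 above the diagonal.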
The theory simplifies enormously if G is compact (for simplicity we also assume
connectivity). Then G is a subgroup of the unitary group U_n(C). Moreover:


Theorem 1.5.5 (Peter-Weyl) Let G be a connected compact finite-dimensional Lie
group. Any module π of G is equivalent to a unitary one, is completely reducible, and
δπ is a module of the reductive Lie algebra g_C = C ⊗ g. Any irreducible G-module is
finite-dimensional, and the derived module for g_C is also irreducible as a Lie algebra
module.


The unitary dual Ĝ is thus a countable discrete space. The key to proving Theorem 1.5.5
is that it is possible to average (integrate) over the group. This G-invariant Haar measure
plays the role here of the ubiquitous Σ_{g∈G} in the finite group theory. For example, a
G-invariant Hermitian form on π is obtained by averaging any given Hermitian form
over its translates; compactness of G is needed to show that the integral converges. See,
for example, chapter II.9 of [92] for an elementary proof of Theorem 1.5.5.

If G is simply-connected as well as compact and connected, then any irreducible
module of g_C lifts to one of G. Otherwise, G ≅ G̃/Z, where G̃ is the universal cover
and Z is some discrete subgroup of G̃ (Theorem 1.4.3), and a g_C-module will lift to one
on G iff, once it is lifted to G̃, it is trivial on Z. If it isn’t trivial on Z, it would be a
projective representation for G (Section 3.1.1).

An elementary example of this is provided by the modules of R (the universal cover
of S¹) and S¹ = R/Z, given earlier. More interesting is to compare the universal cover
SU_2(C) of the group


SO_3(\mathbb{R}) \cong SU_2(\mathbb{C}) \Big/ \left\{ \pm \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right\}.


Their complexified Lie algebra g_C is sl_2(C), whose
irreducible modules correspond to highest weights λ ∈ P_+ = {0, ω_1, 2ω_1, …}. Each of
these exponentiates to an irreducible module of SU_2(C). In particular, the SU_2(C)-module
corresponding to highest weight λ = nω_1 can be realised as the space of homogeneous
polynomials p(z_1, z_2) of degree n, with SU_2(C) action given by


\begin{pmatrix} a & b \\ c & d \end{pmatrix} . p(z_1, z_2) = p(a z_1 + c z_2,\; b z_1 + d z_2). \qquad (1.5.15)


This will be a module of SO_3(R) iff diag(−1, −1) acts trivially, i.e. iff p(z_1, z_2) =
p(−z_1, −z_2) for all p, i.e. iff n is even. Physicists call n/2 the ‘spin’, and the
modules with n odd are called ‘spinors’. See, for example, chapter 20 of [214] for more
on this. More generally, the dominant weight λ = Σ_i λ_i ω_i ∈ P_+ gives a module of
PSL_n(C) = SL_n(C)/Z_n iff n divides Σ_i iλ_i.
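
The parity criterion can be illustrated directly from (1.5.15). In this sketch, polynomials are represented as Python functions and compared at a few sample points (an assumption made for brevity; it is not a symbolic proof), and the helper descends_to_psl simply encodes the divisibility criterion quoted above.

```python
def act(g, p):
    """The action (1.5.15): (g.p)(z1, z2) = p(a z1 + c z2, b z1 + d z2)."""
    (a, b), (c, d) = g
    return lambda z1, z2: p(a * z1 + c * z2, b * z1 + d * z2)

def monomial(k, n):
    """The basis polynomial z1^k z2^(n-k) of degree n."""
    return lambda z1, z2: z1 ** k * z2 ** (n - k)

MINUS_ONE = ((-1, 0), (0, -1))
SAMPLE_POINTS = [(1, 2), (3, -1), (2, 5)]

def minus_one_acts_trivially(n):
    """True iff -1 fixes every basis monomial of degree n at the sample points."""
    return all(
        act(MINUS_ONE, monomial(k, n))(z1, z2) == monomial(k, n)(z1, z2)
        for k in range(n + 1)
        for (z1, z2) in SAMPLE_POINTS
    )

def descends_to_psl(labels):
    """Criterion quoted in the text: n divides sum_i i*lambda_i for SL_n."""
    n = len(labels) + 1
    return sum(i * lam for i, lam in enumerate(labels, start=1)) % n == 0

so3_modules = [n for n in range(7) if minus_one_acts_trivially(n)]
print(so3_modules)
print(descends_to_psl((1,)), descends_to_psl((2,)), descends_to_psl((1, 1)))
```

For SL_2 the second helper reduces to the same parity test, and for SL_3 the adjoint weight (1, 1) passes, as expected for a module on which the centre acts trivially.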

Let G be any compact simply-connected connected Lie group, and g its (real) Lie
algebra. The simply-connected connected complex Lie group associated with the complex
Lie algebra g_C is called the complexification G_C of G. For example, the complexification
of SU_n(C) is SL_n(C). Weyl’s unitary trick says that the irreducible modules of G, g, g_C
and G_C are all in natural bijection, using the derived module, complexification of the
algebra module, ‘exponentiation’ of an algebra module to a simply-connected Lie group
and restriction. Depending on the context, it is sometimes more convenient to look at the
modules of G, g_C or G_C.

All of the irreducible modules of a compact connected Lie group G are constructed
explicitly by the Borel–Weil Theorem. It suffices of course to consider simply-connected
G. Take G = SU_n(C) for concreteness. Let B be the upper-triangular matrices in G_C =
SL_n(C). It is called the Borel subgroup and is a maximal solvable subgroup in G_C. Given
a dominant integral weight λ = Σ λ_i ω_i, put t = λ_1 + 2λ_2 + ⋯ + (n − 1)λ_{n−1} and


\mu = \Big( \sum_{i=1}^{n-1} \lambda_i - \tfrac{1}{n} t,\; \sum_{i=2}^{n-1} \lambda_i - \tfrac{1}{n} t,\; \ldots,\; \lambda_{n-1} - \tfrac{1}{n} t,\; -\tfrac{1}{n} t \Big) \in \mathbb{R}^n.


Let Γ(λ) be the space of holomorphic functions f(g) on G_C (regarded as a complex
manifold) such that


f(gb) = b_1^{\mu_1} \cdots b_n^{\mu_n}\, f(g), \qquad \forall g \in G_C,\quad b = \begin{pmatrix} b_1 & * & * \\ 0 & \ddots & * \\ 0 & 0 & b_n \end{pmatrix}. \qquad (1.5.16)


Then this is a G_C-module (namely, one induced from a one-dimensional B-module),
and it is easy to identify its weights since the maximal torus T (the exponentiation of
the Cartan subalgebra h of g_C, i.e. the diagonal determinant-1 matrices) is contained in
B: we find that Γ(λ) is the contragredient of the highest-weight representation V(λ).
From this picture, the Weyl character formula arises through fixed-point formulae for
the G_C-action on G_C/B [83].

The geometry of this construction is quite pretty (see e.g. section 23.3 of [219] or [83]).
Geometrically, the space G_C/B ≅ G/T is a flag variety whose points are the various
choices 0 ⊂ V_1 ⊂ ⋯ ⊂ V_{n−1} ⊂ C^n of subspaces, where dim V_i = i. Then Γ(λ) is the
space of holomorphic sections of a line bundle G_C ×_B C on G_C/B naturally associated
with λ. Similar comments apply to any other G. Something similar happens for the
Virasoro algebra, where the flag manifold is replaced by the moduli space of curves
(Section 3.1.2).

As discussed in Section 1.1.3, the natural analogue of the group algebra for a Lie
group G is the space L²(G) of functions f : G → C, with convolution product. The
main importance of these spaces of functions is that they are natural G-modules, using
right translation: (h.f)(g) := f(gh). For example, consider G = S¹, so f ∈ L²(S¹) can
be regarded as a function f(x) with period 2π. We find that L²(S¹) decomposes into the
infinite direct sum


L^2(S^1) = \bigoplus_{n \in \mathbb{Z}} V(n)


of irreducible one-dimensional modules V(n). More precisely, L²(S¹) will be a
completion of this algebraic direct sum. This means that any ‘vector’ f ∈ L²(S¹) can be
written as Σ_{n∈Z} f_n where each summand f_n ∈ V(n). Now, V(n) consists of those
functions f_n on which e^{iy} ∈ S¹ acts as (e^{iy}.f_n)(x) := f_n(x + y) = e^{iny} f_n(x); in other words
f_n(x) = c_n e^{inx} for some complex number c_n. Using the orthogonality of the e^{inx}, we
can explicitly construct the projection operator L²(S¹) → V(n), and we find


c_n = \frac{1}{2\pi} \int_0^{2\pi} f(x)\, e^{-inx}\, dx,

which we recognise as the Fourier transform \hat{f}(n) of f.
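
As a quick numerical illustration of this projection, the following sketch approximates the integral for c_n by an equally spaced Riemann sum. The test function and the number of sample points are arbitrary choices of ours.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=256):
    """Riemann-sum estimate of (1/2pi) * integral_0^{2pi} f(x) e^{-inx} dx."""
    step = 2 * math.pi / samples
    total = sum(f(k * step) * cmath.exp(-1j * n * k * step) for k in range(samples))
    return total / samples

def f(x):
    """A test function with c_3 = 1, c_{-1} = 2 and all other c_n = 0."""
    return cmath.exp(3j * x) + 2 * cmath.exp(-1j * x)

coefficients = {n: fourier_coefficient(f, n) for n in range(-4, 5)}

for n in sorted(coefficients):
    print(n, round(abs(coefficients[n]), 9))
```

Only the components in V(3) and V(−1) survive, in agreement with the decomposition of f into the summands f_n.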
More generally, for arbitrary compact G, the Peter-Weyl Theorem tells us that the
matrix entries π(g)_{ij} of the irreducible representations of G are dense in the space
of functions on G. More precisely, the Fourier transform associates with a function
f ∈ L²(G), a matrix-valued function \hat{f}(π) on the unitary dual Ĝ, defined by


\hat{f}(\pi) = \int_G f(g)\, \pi(g)^{-1}\, dg,


where as usual we’re using the Haar measure on G, normalised so that the volume of G
is 1. Then for any f ∈ L²(G),


f(g) = \sum_{\pi \in \hat{G}} \dim \pi \;\, \mathrm{tr}\big( \pi(g)\, \hat{f}(\pi) \big).

As is familiar from the abelian case, the convolution product is sent to the ordinary
(matrix) product: \widehat{f * h}(π) = \hat{f}(π) \hat{h}(π). We also get a unitary isomorphism between
L²(G) and what we can call L²(Ĝ) (the space of these matrix-valued \hat{f}), called the
Plancherel formula:


\int_G |f(g)|^2\, dg = \sum_{\pi \in \hat{G}} (\dim \pi)\; \mathrm{tr}\big( \hat{f}(\pi)\, \hat{f}(\pi)^{\dagger} \big).
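
A finite group is a compact group whose normalised Haar measure is the average over its elements, so the inversion formula can be tested exactly on S_3. The sketch below rests on several assumptions of ours: the transform is taken with π(g)^{-1}, the convolution is (f ∗ h)(g) = average over k of f(k^{-1}g) h(k) (the order of the matrix product depends on this convention), and the two-dimensional representation is written with integer matrices rather than unitary ones, which is harmless for inversion but means the Plancherel formula is not tested here.

```python
from fractions import Fraction

def matmul(a, b):
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

IDENTITY = ((1, 0), (0, 1))
S = ((0, 1), (1, 0))
R = ((0, -1), (1, -1))

def generate_group(generators):
    elements, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in generators:
            new = matmul(g, s)
            if new not in elements:
                elements.add(new)
                frontier.append(new)
    return sorted(elements)

GROUP = generate_group([S, R])          # S3 as 2x2 integer matrices

def inverse(g):
    return next(x for x in GROUP if matmul(g, x) == IDENTITY)

def det(g):
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

# The three irreducible representations of S3: trivial, sign, two-dimensional.
IRREPS = [
    lambda g: ((1,),),
    lambda g: ((det(g),),),
    lambda g: g,
]

def transform(f, pi):
    """hat f(pi) = average over G of f(g) pi(g)^{-1}."""
    dim = len(pi(IDENTITY))
    total = [[Fraction(0)] * dim for _ in range(dim)]
    for g in GROUP:
        m = pi(inverse(g))
        for i in range(dim):
            for j in range(dim):
                total[i][j] += Fraction(f[g]) * m[i][j]
    return tuple(tuple(x / len(GROUP) for x in row) for row in total)

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def invert(f_hats, g):
    """sum over pi of dim(pi) * tr( pi(g) hat f(pi) )."""
    return sum(
        len(f_hat) * trace(matmul(pi(g), f_hat))
        for f_hat, pi in zip(f_hats, IRREPS)
    )

def convolve(f, h):
    """(f * h)(g) = average over k of f(k^{-1} g) h(k) (assumed convention)."""
    return {
        g: sum(Fraction(f[matmul(inverse(k), g)]) * h[k] for k in GROUP) / len(GROUP)
        for g in GROUP
    }

f = {g: index + 1 for index, g in enumerate(GROUP)}          # arbitrary test data
h = {g: (index * index) % 5 for index, g in enumerate(GROUP)}

f_hats = [transform(f, pi) for pi in IRREPS]
recovered = {g: invert(f_hats, g) for g in GROUP}
inversion_ok = all(recovered[g] == f[g] for g in GROUP)
convolution_ok = all(
    transform(convolve(f, h), pi) == matmul(transform(f, pi), transform(h, pi))
    for pi in IRREPS
)
print(inversion_ok, convolution_ok)
```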

The representation theory of noncompact Lie groups is completely different. This can
already be seen for the additive group G = R, which has a continuum of irreducible
unitary modules (namely e^{iαx} for all α ∈ R). The unitary dual Ĝ can involve both
continuous and discrete parts, and can have a wild topology. Once again, a unitary module
is completely reducible into irreducible unitary ones, but for a general noncompact G
a direct integral (Section 1.3.1), rather than a direct sum, will be needed, and for wild
groups the uniqueness of this decomposition will be lost.

Any connected Lie group is (up to central extensions) the semi-direct product of 
a solvable Lie group with a semi-simple Lie group — this is the Levi decomposition 
(see e.g. appendix B in [348]). The representation theory of solvable groups is quite 
well understood, using the orbit method. It relates the unitary dual to certain orbits of 
G on the dual g* of the Lie algebra g of G (see [346] for an excellent introduction, 
although section 2 of [563] may be more accessible to physicists). Physically, this is just 
geometric quantisation: G is a symmetry of a physical system; the classical phase space 
is a symplectic manifold on which G acts (these are essentially the coadjoint orbits); 
quantum mechanically we would like this to correspond to a Hilbert space carrying a 
unitary representation of G. Geometric quantisation tries to do for quantum theories what 
the symplectic geometry of Hamiltonian mechanics does for classical ones: provide an 
elegant and natural mathematical formulation. 

The effect of the semi-direct product on the unitary dual is also under control. How- 
ever, the representation theory of the (noncompact real) semi-simple groups is poorly 
understood. See [349] for a modern review. 

For example, the Heisenberg group H consisting of all matrices


\begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix}, \qquad \forall\, a, b, c \in \mathbb{R},


is simply-connected and solvable. Its irreducible unitary modules are given in
Theorem 2.4.2 below, and we can naturally identify its unitary dual with the xy-plane in R³
together with the z-axis. On the other hand, SL_2(R) is a semi-simple noncompact group,
topologically equivalent to the interior of a solid torus; its unitary irreducible modules
are described in Section 2.4.1, and its unitary dual consists of three one-dimensional
families (the principal, spherical principal, and complementary series) and a countable
family (the discrete series).


Question 1.5.1. Interpret the trigonometric identities given at the beginning of this
section, in terms of the character theory of A_1.


Question 1.5.2. Classify all two-dimensional representations of the abelian Lie algebra 
g = C’. Which of these are completely reducible? 


Question 1.5.3. Let g = sh, (C). From first principles, compute the Killing form 
k(Aaļ| Eca), K(AalAp), K (Eal Eca). 


Question 1.5.4. In effect, Question 1.4.7 defines a representation ρ of sl_3(C).

(a) Find the weight-space decomposition of this representation of sl_3(C), as well as the
corresponding character.

(b) Find the root-space decomposition of sl_3(C), i.e. the weight-space decomposition of
the adjoint representation of sl_3(C). Also compute the character.


Question 1.5.5. Recall the Verma modules M(λ) for A_1 constructed in Section 1.5.1.
(a) Prove that each M(λ) is indecomposable (i.e. cannot be written as the direct sum of
two submodules).

(b) When λ ∉ N, prove that M(λ) is irreducible. Thus L(λ) = M(λ) for these λ.

(c) When λ = n ∈ N, find all submodules. Verify that the maximal one has highest weight
vector x_{n+1}.


Question 1.5.6. Let g = sl_2(C).

(a) Set C := ef + fe + ½h² ∈ U(g). Show that C is in the centre of U(g). (C is called
the quadratic Casimir of g; there is an analogue for any semi-simple g.)

(b) Given any irreducible module π of g, prove that Z := 2π(f)π(e) + π(h) + ½π(h)²
is a scalar multiple of the identity.


Question 1.5.7. Let G = SU(2). Then g = sl_2(C) (which is the complexification of the
Lie algebra of G) acts naturally on the space C^∞(G) of all smooth complex-valued
functions on G. In particular, g can be identified as the space of all left-invariant
first-order differential operators. Prove that U(g) can be identified with the space of all
left-invariant finite-order differential operators on C^∞(G).


Question 1.5.8. (a) Verify the claim in Section 1.5.4 that g = D_4 with L(1, 0, 0, 0) has
twisted character restricting to B_3-character ch_{(1,0,0)} − ch_{(0,0,0)} and C_3-character ch_{(1,0,0)}.
(b) Repeat this calculation for g = A_4 and λ = (1, 0, 0, 1).


Question 1.5.9. (a) In Section 1.5.5 we gave a module of SU_2(C) using degree n
polynomials. Find the derived module for the Lie algebra sl_2(C), find its weight-spaces, and
prove the equivalence with L(nω_1).

(b) Work out the Borel–Weil representation Γ(λ) for SU_2(C), for any λ = nω_1, n ∈ N.


1.6 Category theory


The only difficulty in understanding categories is in realising that they have no real 
content. They’re just a language, highly abstract like the more familiar set theory, but 
one that can be both natural and suggestive. It tries to deflect some of our instinctive 
infatuation with objects (nouns), to the mathematically more fruitful one with structure- 
preserving maps between objects (verbs). 

Category theory is intended as a universal language of mathematics, so all concepts 
should be translated into it. Much as beavers, who as a species hate the sound of running 
water, plaster a creek with mud and sticks until alas that cursed tinkle stops, so do category 
theorists devise elaborate and obscure definitions in an attempt to capture a concept that 
to most of us seemed perfectly clear before they got to it. But at least sometimes this 
works admirably — for instance no one can be immune to the charm of treating knot 
invariants with braided monoidal categories. 


1.6.1 General philosophy 


A category C consists of two kinds of things. One are the objects, and the other are the 
arrows (or morphisms). An arrow, written f : A —> B, has an initial and a final object 
(A and B, respectively). We let Hom(A, B) denote all arrows A —> B in the category. 
Arrows f, g can be composed to yield a new arrow f o g, if the final object of g equals 
the initial object of f. Maps between categories are called functors if they take each 
object (respectively, arrow) of one to the objects (respectively, arrows) of the other, and 
preserve composition. A gentle introduction to the mathematics of categories is [370]; 
the standard reference is [397]. 

The standard category is called Set, where the ‘objects’ are sets, and the arrows from A 
to B are functions f : A —> B. Many algebraic categories are of that form, with objects 
being sets with certain structure, and the arrows being structure-preserving maps. A 
typical example is Vect, where the objects are vector spaces over some fixed field and 
the arrows are linear maps. A rather trivial example of a functor F : Vect → Set sends a
vector space V to its underlying set; F simply ‘forgets’ the vector space structure on V
and ignores the fact that the arrows f in Vect are linear.

Geometric categories often employ the idea of cobordism. For instance, fix a manifold
M; let the objects be points p ∈ M, and the arrows p → q be homotopy equivalence
classes of paths σ in M from p to q. Composition of arrows is given by (1.2.5). This
category is called the fundamental groupoid of M; note that Hom(p, p) = π_1(M, p).
A higher-dimensional example is called Riem: its objects are disjoint unions of
(parametrised) circles S¹, and the arrows are (conformal equivalence classes of)
cobordisms, that is (Riemann) surfaces whose boundaries are those circles. Composition of
arrows in Riem amounts to sewing the surfaces along the appropriate boundary circles.
A final example of a geometric category is Braid: its objects are any finite number
(possibly 0) of ‘hooks’, Hom(m, n) is empty unless m = n, in which case the arrows are
the n-braids β ∈ B_n. Such categories, where arrows consist of equivalence classes, are
called quotient categories [397].

Fig. 1.23 The definition of product and sum.


For a baby example of the translation of the familiar into category theory, consider 
the usual definition of a one-to-one function: f(x) = f(y) only when x = y. Category 
theory replaces this with the right cancellation law: call an arrow f : A → B
‘one-to-one’ if for any object C and any arrows g, h ∈ Hom(C, A), f ∘ g = f ∘ h implies g = h.
The reader can easily verify that in Set this agrees with the usual definition. What does 
this redefinition gain us? It certainly doesn’t seem any simpler. But it does change the 
focus from the argument of f to the global functional behaviour of f, and a change 
of perspective can never be bad. It allows us to transport the idea of one-to-one-ness to 
arbitrary categories. For instance, in the category Riem, all arrows are ‘one-to-one’. 
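
For small finite sets the agreement between the two definitions can be checked by brute force. The sketch below encodes functions as dictionaries (our own encoding) and tests cancellation only against a few small test objects C, which is an assumption that happens to suffice in Set because a one-element C already detects a failure of injectivity.

```python
from itertools import product

def all_functions(domain, codomain):
    """Every function domain -> codomain, encoded as a dict."""
    return [dict(zip(domain, values)) for values in product(codomain, repeat=len(domain))]

def compose(f, g):
    """f o g: first apply g, then f."""
    return {x: f[g[x]] for x in g}

def is_injective(f):
    return len(set(f.values())) == len(f)

def is_cancellable(f, domain, test_objects):
    """f o g = f o h implies g = h, for all g, h : C -> A with C a test object."""
    for c in test_objects:
        arrows = all_functions(c, domain)
        for g in arrows:
            for h in arrows:
                if compose(f, g) == compose(f, h) and g != h:
                    return False
    return True

A = [0, 1, 2]
B = ['x', 'y', 'z']
TEST_OBJECTS = [[], ['p'], ['p', 'q']]     # a few small sets C

agree = all(
    is_injective(f) == is_cancellable(f, A, TEST_OBJECTS)
    for f in all_functions(A, B)
)
print(len(all_functions(A, B)), agree)
```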

Or consider the notion of product. In category theory, we say that the triple (P, a, b) is a
product of objects A, B if a : P → A and b : P → B are arrows, and if for any f : C →
A, g : C → B, there is a unique arrow h : C → P such that f = a ∘ h and g = b ∘ h. See
the left diagram in Figure 1.23. This notion unifies several constructions (each of which is
the ‘product’ in an appropriately chosen category): Cartesian product of sets; intersection
of sets; multiplication of numbers; the logical operator ‘and’; direct product; infimum
in a partially ordered set; etc. Sum can be defined similarly, by reversing the orientation
of all the arrows in the diagram for product (see the right diagram in Figure 1.23).
This unifies the constructions of disjoint union, ‘or’, addition, tensor product, direct
sum, supremum, etc. Of course the specific construction of sum and product depends
sensitively on the category. For example, in the category Ab-Group, where objects are
abelian groups and arrows are homomorphisms, the sum of the cyclic groups Z_2 and
Z_3 is their direct product Z_2 × Z_3 ≅ Z_6, while in the category Group, where objects
are groups and arrows homomorphisms, the sum of Z_2 and Z_3 is PSL_2(Z)! See
Question 1.6.3.
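
The ‘infimum in a partially ordered set’ case is small enough to verify by exhaustive search. As a sketch, take the divisors of 60 as objects, with a single arrow a → b whenever a divides b (our choice of example category); since each Hom-set has at most one element, the uniqueness of h and the equations f = a ∘ h, g = b ∘ h hold automatically, and only existence of arrows needs testing.

```python
from math import gcd

OBJECTS = [d for d in range(1, 61) if 60 % d == 0]   # divisors of 60

def arrow(a, b):
    """There is exactly one arrow a -> b when a divides b, and none otherwise."""
    return b % a == 0

def products(a, b):
    """All objects P satisfying the universal property of a product of a and b."""
    return [
        p for p in OBJECTS
        if arrow(p, a) and arrow(p, b)
        and all(arrow(c, p) for c in OBJECTS if arrow(c, a) and arrow(c, b))
    ]

def sums(a, b):
    """Same definition with every arrow reversed."""
    return [
        s for s in OBJECTS
        if arrow(a, s) and arrow(b, s)
        and all(arrow(s, c) for c in OBJECTS if arrow(a, c) and arrow(b, c))
    ]

def lcm(a, b):
    return a * b // gcd(a, b)

product_is_gcd = all(products(a, b) == [gcd(a, b)] for a in OBJECTS for b in OBJECTS)
sum_is_lcm = all(sums(a, b) == [lcm(a, b)] for a in OBJECTS for b in OBJECTS)
print(products(12, 30), sums(12, 30), product_is_gcd, sum_is_lcm)
```

In this category the product is the greatest common divisor and the sum is the least common multiple, one more pair of familiar constructions unified by the two diagrams.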

This generality of course comes with a price: it can wash away all of the endearing 
special features of a favourite theory or structure. There certainly are contexts where, 
for example, all human beings should be considered equal, but there are other contexts 
where the given human is none other than your mother and must be treated as such. 


1.6.2 Braided monoidal categories 


This book tries to identify the natural context for Moonshine. Categories more than 
sets provide the most appealing language for this context. The starting point for this 
formulation is braided monoidal categories. Standard references include chapter 1 of
[534], chapter 1 of [32] and chapter XIII of [338].

Fig. 1.24 The associativity pentagon.


Let us try to translate the vector space tensor product into category theoretic language. 
The result, called a monoidal or tensor category, was obtained by MacLane (1963). 

Let U_i, V_i, i = 1, 2, 3, be vector spaces, and choose any linear maps f_j : U_j → U_{j+1},
g_j : V_j → V_{j+1}, j = 1, 2. Then the composition of the tensor product maps f_j ⊗
g_j : U_j ⊗ V_j → U_{j+1} ⊗ V_{j+1} is given by (f_2 ⊗ g_2) ∘ (f_1 ⊗ g_1) = (f_2 ∘ f_1) ⊗ (g_2 ∘ g_1).
This is exactly the same as saying that ‘⊗’ is a functor between the categories Vect × Vect
and Vect, where the Cartesian product of categories has the obvious meaning.
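
In coordinates the tensor product of linear maps is the Kronecker product of their matrices, so the functoriality identity can be checked on explicit integer matrices. The matrices below are arbitrary test data of ours, chosen with unequal dimensions so that a mismatch in index conventions would show up.

```python
def matmul(a, b):
    """Matrix product; a is p x q and b is q x r, as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def kron(a, b):
    """Kronecker product, the matrix of the tensor product of two linear maps."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
        for i in range(len(a)) for k in range(len(b))
    ]

# f1 : U1 -> U2, f2 : U2 -> U3, g1 : V1 -> V2, g2 : V2 -> V3 (arbitrary test data)
f1 = [[1, 2], [0, 1], [3, -1]]          # dim U1 = 2, dim U2 = 3
f2 = [[2, 0, 1], [1, 1, 0]]             # dim U3 = 2
g1 = [[1, -1, 0]]                       # dim V1 = 3, dim V2 = 1
g2 = [[4], [5]]                         # dim V3 = 2

left = matmul(kron(f2, g2), kron(f1, g1))              # (f2 (x) g2) o (f1 (x) g1)
right = kron(matmul(f2, f1), matmul(g2, g1))           # (f2 o f1) (x) (g2 o g1)
print(left == right, len(left), len(left[0]))
```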

The tensor product should be associative up to isomorphism: for any objects U, V, W,
there should be an isomorphism a_{UVW} : (U ⊗ V) ⊗ W → U ⊗ (V ⊗ W) (called the
associativity constraint). It should obey a consistency condition coming from the
isomorphism ((U ⊗ V) ⊗ W) ⊗ X ≅ U ⊗ (V ⊗ (W ⊗ X)); that is, there are two ways of
computing that isomorphism in terms of associativity, and the resulting isomorphisms
should agree:


(\mathrm{id}_U \otimes a_{V,W,X}) \circ a_{U, V \otimes W, X} \circ (a_{U,V,W} \otimes \mathrm{id}_X) = a_{U,V,W \otimes X} \circ a_{U \otimes V, W, X}. \qquad (1.6.1)


This is called the pentagon axiom, thanks to its depiction in Figure 1.24.

Moreover, tensoring any object V with the one-dimensional vector space (call it ‘1’)
must give back V, so there are isomorphisms l_V : 1 ⊗ V → V, r_V : V ⊗ 1 → V. These
are required to be consistent with the associativity constraint, by requiring the triangle
axiom


r_V \otimes \mathrm{id}_W = (\mathrm{id}_V \otimes l_W) \circ a_{V,1,W}. \qquad (1.6.2)


A monoidal category [397] is any category C possessing such a functor ⊗, with unit 1
and invertible arrows l_V, r_V, a_{UVW} satisfying (1.6.1) and (1.6.2). Of course Vect with
tensor products is monoidal, as is Set with disjoint union. Braid is monoidal; the tensor
product of an n-braid with an m-braid is the (n + m)-braid obtained by placing the two
braids side-by-side. There are numerous other examples. The word ‘monoidal’ comes
from ‘monoid’, meaning a group-like structure without inverses.

MacLane proved two things. The first is coherence, which says that (1.6.1) and (1.6.2)
are sufficient. Remarkably, any other consistency condition we may care to write down
will be redundant. To give a random example, the identity involving a’s, l’s and r’s
saying that the isomorphisms coming from U ⊗ ((V ⊗ W) ⊗ (1 ⊗ (X ⊗ Y))) ≅ U ⊗
(V ⊗ W) ⊗ (X ⊗ Y) must agree can be derived from the pentagon and the triangle.

Fig. 1.25 The hexagon equation.

Secondly, MacLane proved that any monoidal category C is (monoidally) equivalent
to a monoidal category C^str where the associativity constraints are identity maps. Such
a monoidal category is called strict; in it we can drop all associativity constraints as
trivial, and with them all braces ‘(’ and ‘)’ in our tensor products.

Now that we’ve handled associativity of the tensor product, let’s turn next to
commutativity. We can’t expect anything like MacLane’s strictness to apply here: although
the vector spaces U ⊗ V and V ⊗ U are naturally isomorphic, they are not equal. We
proceed though in the same way.

For any objects U, V, we have an invertible arrow (called a commutativity constraint)
c_{UV} : U ⊗ V → V ⊗ U. Some natural relations are


c_{U'V'} \circ (f \otimes g) = (g \otimes f) \circ c_{UV}, \qquad (1.6.3a)
c_{VU} \circ c_{UV} = \mathrm{id}_{U \otimes V}, \qquad (1.6.3b)
c_{U, V \otimes W} = (\mathrm{id}_V \otimes c_{UW}) \circ (c_{UV} \otimes \mathrm{id}_W). \qquad (1.6.3c)


The isomorphism (U ⊗ V) ⊗ W ≅ (W ⊗ U) ⊗ V, or more explicitly the equation

(c_{UW} \otimes \mathrm{id}_V) \circ (\mathrm{id}_U \otimes c_{VW}) = c_{U \otimes V, W}, \qquad (1.6.3d)


is called the hexagon axiom (see Figure 1.25).

Any monoidal category with commutativity constraints c_{UV} obeying (1.6.3) is called
a symmetric monoidal category (MacLane, 1965). Vect is an example. Others are the
categories Rep g or Rep G of finite-dimensional g- or G-modules, for a Lie algebra g (or
Lie group G), with tensor product. In fact, Tannaka–Krein duality states that a monoidal
category with both product and sum, that looks like Rep G (e.g. it has a unit object
1, a contragredient, and all objects decompose into a sum of simple ones), is Rep G
for a unique such group G. See, for example, section 9.4 of [398] for details and a
generalisation.

In 1985, Joyal and Street [321] suggested dropping the symmetry condition (1.6.3b).
The resulting categories they call braided monoidal, for reasons that will be clear shortly.
They also pointed out that there is a very convenient graphical calculus in such categories,
which elegantly keeps track of all relations. Namely, write arrows vertically and tensor
products horizontally. Composition is given by vertical concatenation. The left-most
diagram in Figure 1.26 represents the arrow f ⊗ g where f ∈ Hom(U, V) and g ∈
Hom(W, X), while the commutativity constraint c_{UV} is depicted as in the right-most.
The associativity constraint a_{ABC} is ignored as we identify it with the identity. So we
label strands with objects, which can change labels only at a box (‘coupon’). The hexagon
axiom takes the form of Figure 1.27, which we recognise as two equivalent braids. One
immediate consequence is that the category Braid described in the last subsection is braided
monoidal, provided we define c_{mn} as in Figure 1.28.

Fig. 1.26 The graphical calculus.

Fig. 1.27 The hexagon axiom revisited.

Fig. 1.28 The commutativity constraint c_{4,3} in Braid.

In terms of the graphical calculus, MacLane’s symmetry condition (1.6.3b) would 
permit us to slip one strand through another, reducing the content of a braid (i.e. some 
combination of commutativity constraints) to that of its underlying permutation. 

Joyal–Street also proved coherence for braided monoidal categories, that is equations
(1.6.3a), (1.6.3c) and (1.6.3d) are sufficient to establish the well-definedness of other
isomorphisms involving associativity and commutativity. For a famous example, U ⊗
V ⊗ W ≅ W ⊗ V ⊗ U yields the Yang–Baxter equation


(c_{VW} \otimes \mathrm{id}_U) \circ (\mathrm{id}_V \otimes c_{UW}) \circ (c_{UV} \otimes \mathrm{id}_W) = (\mathrm{id}_W \otimes c_{UV}) \circ (c_{UW} \otimes \mathrm{id}_V) \circ (\mathrm{id}_U \otimes c_{VW}), \qquad (1.6.4)

which corresponds graphically to the braid equivalence of Figure 1.29 (compare
Figure 1.2). We return to the Yang–Baxter equation in Section 6.2.3.

Fig. 1.29 The Yang–Baxter equation.
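
A toy braiding makes (1.6.4) concrete. The sketch below assumes, purely for illustration, integer-graded basis vectors with c(u ⊗ v) = q^{deg u · deg v} v ⊗ u for a formal parameter q; this example is ours and is not discussed in the text. A state records the exponent of q together with the tuple of degrees, so the check is exact.

```python
from itertools import product

def braid(state, position):
    """Apply c to the tensor factors at `position` and `position + 1`.

    A state is (exponent, degrees): the basis vector with the given tuple of
    degrees, multiplied by q**exponent. The assumed toy braiding is
    c(u (x) v) = q**(deg u * deg v) v (x) u.
    """
    exponent, degrees = state
    a, b = degrees[position], degrees[position + 1]
    swapped = degrees[:position] + (b, a) + degrees[position + 2:]
    return (exponent + a * b, swapped)

def yang_baxter_lhs(state):
    # (c (x) id) o (id (x) c) o (c (x) id): the right-most factor acts first
    return braid(braid(braid(state, 0), 1), 0)

def yang_baxter_rhs(state):
    # (id (x) c) o (c (x) id) o (id (x) c)
    return braid(braid(braid(state, 1), 0), 1)

STATES = [(0, degrees) for degrees in product(range(-2, 3), repeat=3)]

yang_baxter_holds = all(yang_baxter_lhs(s) == yang_baxter_rhs(s) for s in STATES)

# The symmetry condition (1.6.3b) would need c o c = id, i.e. exponent 0
double_braid = braid(braid((0, (1, 1)), 0), 0)
print(yang_baxter_holds, double_braid)
```

Both sides of (1.6.4) multiply by q^{ab+ac+bc} and reverse the three factors, whereas applying c twice leaves a factor q^2, so the symmetry condition (1.6.3b) fails for this braiding unless q^2 = 1.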

It’s not a coincidence that Figure 1.29 is a braid equivalence — it must be, since Braid 
is a braided monoidal category. Conversely, any braid equivalence yields an equation 
holding in any braided monoidal category. Braid is the least-common divisor of all 
braided monoidal categories, the one with commutativity constraints and nothing else, 
obeying the minimum possible relations — it is universal or free. More precisely: 


Theorem 1.6.1 [321] Let C be any (strict) braided monoidal category, and A any
object in it. Then there exists a unique braided monoidal functor F : Braid → C with
F(1) = A and F(c_{1,1}) = c_{A,A}.


A ‘braided monoidal’ functor is one preserving the braided monoidal structure in the 
obvious way. The object ‘1’ of Braid denotes one hook, which generates via tensoring 
all other objects in Braid. This important theorem relates topology and algebra. 

The simplest example (in fact too simple) of such universality is the freeness of Z: given
any group G with one generator g, there is a unique group homomorphism φ : Z → G
sending 1 ∈ Z to g ∈ G. Any such G defines an invariant for Z: the integer n is assigned
the invariant φ(n). We call it an invariant, because equal integers must get assigned
the same G-value, even if they look different (e.g. 3 and 2 − 1 + 2 superficially look
different, but will be assigned the same G-value φ(3) = φ(2 − 1 + 2)). For example,
the invariant φ for G = Z_2 = {[0], [1]} assigns [0] to any even n ∈ Z and [1] to any
odd n. Because φ is structure-preserving, computing this invariant is relatively easy. Of
course integer invariants are not terribly exciting, because it is so easy to determine if
two integer expressions (involving arbitrary sums and subtractions) are equal.

Likewise, the universality of Braid means that, given any braided monoidal category
C and any braid β ∈ B_n, we get a braid-invariant F(β) ∈ Hom_C(A^{⊗n}, A^{⊗n}). Here the
object A^{⊗n} of C means A ⊗ ⋯ ⊗ A (n times). It is not so difficult to determine directly
whether two braids are the same (ambient isotopic), for example by ‘combing the braid’
(see e.g. pages 24-5 of [59]), and thus these braid invariants are also not intrinsically
valuable. But they are a stepping stone to something that is.
Fig. 1.30 A typical ribbon in Hom((+, −), (−, −, +, +)).


Fig. 1.31 Evaluation, coevaluation and twist. 


Theorem 1.6.1 implies that, in any braided monoidal category C, the braid group B_n
acts on both Hom_C(U^{⊗n}, V) and Hom_C(U, V^{⊗n}) and the pure braid group P_n acts on
both Hom_C(U_1 ⊗ ··· ⊗ U_n, V) and Hom_C(U, V_1 ⊗ ··· ⊗ V_n) (why?). Thus the groups
governing braided monoidal categories are the braid groups B_n and P_n, while those of
symmetric monoidal categories are the symmetric groups S_n (hence their names).

If we continue with our project of categorising tensor product, we will be rewarded. We
can introduce the notion of duals A* of objects (in the sense of the dual vector space), duals
of arrows f* (the analogue of transpose of matrices), the evaluation map A* ⊗ A → 1
(the evaluation f(a) of a functional f ∈ A* on a vector a ∈ A), coevaluation 1 →
A* ⊗ A (let b_i be a basis of vector space A and b_i* ∈ A* the dual basis, then the element
Σ_i b_i* ⊗ b_i ∈ A* ⊗ A is independent of the choice of basis). These obey the obvious
relations (see, for example, chapter 1 of [534]) and the result is called a ribbon category;
in place of the formal definition it suffices to give the universal ribbon category.

The objects of Ribbon are ordered n-tuples A = (a_1, ..., a_n) of signs, a_i = ±1, for
n ≥ 0 (n = 0 is the empty object ∅). Hom(A, B) consists of isotopy classes of knotted
linked twisted oriented strips, called ribbons. A strip can start at position i on the top
(or position j on the bottom) only if a_i = +1 (or b_j = −1, respectively); similarly, it
can end at i or j only if a_i = −1 or b_j = +1 (see Figure 1.30). Braiding is as before.
The dual of (a_1, ..., a_n) is (−a_n, ..., −a_1), and the dual of a ribbon is given by rotation
through 180°. The evaluation and coevaluation are given in Figure 1.31.

We use ribbons (strips) rather than links (strands) because the 360° turn depicted on 
the right of Figure 1.31 cannot be straightened without introducing a twist in the strip. 
Up to isotopy, a ribbon can be thought of as braided knotted strands (the spine of each 
strip) together with an integer assigned to each strand (saying how much that strip is 
twisted). 
As on the left of Figure 1.26, it is very useful to colour ribbons. Let S be any set; by
Ribbon_S we mean the category with objects ((A_1, s_1), ..., (A_k, s_k)) for A_i ∈ S, s_i ∈ {±}.
The arrows are as before except now they are coloured with A ∈ S; if the ribbon has
endpoints, they must be of the form (A, s) and (A, s′) where the signs s, s′ are as
before.

Two isotopic ribbons define an identity holding in any ribbon category: 


Theorem 1.6.2 [473] Let C be a (strict) ribbon category and let S be the set of
its objects. Then there exists a unique ribbon functor F : Ribbon_S → C such that
F(A, +) = A and F(A, −) = A*.


By the usual arguments, any ribbon category C gives us (isotopy) invariants of braided
knotted ribbons. The most interesting (because it is the simplest) special case concerns
any ribbon R ∈ Hom_Ribbon(∅, ∅) without ends: the invariant F(R) will lie in
Hom_C(F(∅), F(∅)). This gives an invariant for any link, by drawing its ribbon with zero
twist for each strip. Of course some ribbon categories give a complete link invariant
(why?).

Unlike for braids, we have no effective way to determine if linked ribbons or links 
are ambient isotopic (but see [283]), so these invariants are topologically interesting. 
For example, they permit an easy proof that the trefoil and its mirror image are not 
ambient isotopic, something that took a clever argument from Dehn to do originally. The 
functoriality property of F makes them relatively easy to compute. 

It is far from obvious that there are any nontrivial calculationally practical examples 
of ribbon categories, independent of Ribbon. Fortunately though there are: although 
Ribbon is geometric, there are several ribbon categories coming from algebra (namely 
representation theory). In fact, there are now so many that the main value of Theorem 1.6.2 
is organisational, conceptually gathering together a plethora of link invariants that have 
been accumulating since the 1980s, starting with the Jones polynomial. 

This treatment can and should be pushed much further, starting with the direct sum
U ⊕ V of objects. See [534], [398], [353] for more details and developments. The
refinement called modular category is the one of greatest relevance to the mathematics 
and physics related to Moonshine. We return to categories in Section 4.4.1. 

Vogel [547] defined a monoidal category D’, which looks like the category of modules 
of a Lie algebra. He calls it the Universal Lie algebra, since given any simple Lie 
(super)algebra g, there is a unique functor from D’ to the category of g-modules satisfying 
certain natural properties. Roughly, Vogel assigns each such Lie (super)algebra a different 
point on the projective plane, from which much of its data can easily be computed. For 
example, the A-series corresponds to the projective coordinates [n, 2, −2], while the
exceptional series (the bottom row of Figure 1.20) falls on the line [−2, a + 4, 2a + 4].
The ‘universal decompositions’ and dimension formulae described in Section 1.5.2 arise 
because they hold for D’. 


Question 1.6.1. A variety is a solution set to a system of polynomial equations over some 
ring R. Interpret this as a functor from a category of rings to a category of sets. 
Question 1.6.2. (a) Find what (if anything) product and sum (in the sense of Figure 1.23)
are in the category Set. 
(b) Same question for the category Riem. 


Question 1.6.3. (a) Show that in the category Ab-Group (where objects are abelian 
groups and arrows are group homomorphisms), sum and product are identical. 
(b) Show that in the category Group, product is direct product, but sum is not. 


Question 1.6.4. Let L be any lattice (Section 1.2.1). Define a category whose objects are
elements of L, with Hom(v, v) = C and Hom(v, w) = {0} whenever v ≠ w. Composition
of arrows is multiplication. Complete the construction of a ribbon category for this
category, where the braiding c_{v,w} is e^{iv·w}.


1.7 Elementary algebraic number theory 


The coefficients of the McKay–Thompson series T_g are always integers, as are the fusion
multiplicities N_{ab}^c in RCFT. But non-integers often lurk in the shadows, secretly watching
their more arrogant brethren the integers strut. One of the consequences of their presence 
can be the existence of certain Galois symmetries. The Galois theory of cyclotomic fields 
plays a background role in Moonshine, much as it does for finite groups and modular 
forms. We sketch the basics in this section. 

Galois automorphisms are a generalisation of complex conjugation. If in your problem
complex conjugation seems interesting, then there is a good chance other Galois
automorphisms will play a role.


1.7.1 Algebraic numbers 


Euler and Lagrange were the first to show that ‘weird’ (complex) numbers could tell us
about the integers, but it took Gauss (c. 1831) to do this with care and subtlety. For an
example of this idea, suppose we are interested in the equation n = a² + b². Consider
for concreteness 5 = 2² + 1². We can write this as 5 = (2 + i)(2 − i), so we are led
to consider complex numbers of the form a + bi, for a, b ∈ Z. These are now called
‘Gaussian integers’.


Fact Let p ∈ Z be any prime number. Then p factorises (i.e. is composite) over the
Gaussian integers iff p = 2 or p ≡ 1 (mod 4).

Now suppose p ≢ 3 (mod 4) is prime, and factorise it p = (a + bi)(c + di). Then
p² = (a² + b²)(c² + d²), so a² + b² = c² + d² = p. Conversely, suppose p = a² + b²,
then p = (a + bi)(a − bi). Thus:

Consequence¹⁵ Let p ∈ Z be any prime number. Then p = a² + b² for a, b ∈ Z iff
p = 2 or p ≡ 1 (mod 4).


15 This result was first stated by Fermat in one of his infamous margin notes (another is discussed shortly), 
and was finally proved a century later by Euler. For a one-line proof see Question 1.7.1. 
Now we can answer the question: when can n be written as a sum of two squares n =
a² + b²? Write out the prime decomposition n = ∏_p p^{a_p}. Then n = a² + b² has a solution
iff a_p is even for every p ≡ 3 (mod 4). For instance, 60 = 2²·3¹·5¹ cannot be written
as the sum of two squares, but 90 = 2¹·3²·5¹ = {(1 + i)3(1 + 2i)}{(1 − i)3(1 − 2i)}
can (e.g. 90 = (−3)² + 9²). In fact we can find and count all solutions.
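The two-squares criterion is easy to test by machine. The following is only a sketch in Python: the function names are ours, not the book's, and the brute-force search is included purely as an independent check of the criterion on small n.

```python
from math import isqrt

def prime_factorisation(n):
    """Return {p: a_p} for n >= 1, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def is_sum_of_two_squares(n):
    """Criterion from the text: a_p is even for every prime p = 3 (mod 4)."""
    return all(a % 2 == 0
               for p, a in prime_factorisation(n).items() if p % 4 == 3)

def two_square_representations(n):
    """Brute force: all (a, b) with 0 <= a <= b and a^2 + b^2 = n."""
    reps = []
    for a in range(isqrt(n // 2) + 1):
        b2 = n - a * a
        b = isqrt(b2)
        if b * b == b2:
            reps.append((a, b))
    return reps

print(is_sum_of_two_squares(60), two_square_representations(60))
print(is_sum_of_two_squares(90), two_square_representations(90))
```

The search only lists representations with 0 ≤ a ≤ b; signs and order of a, b (such as the (−3)² + 9² above) are recovered by symmetry.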

More generally, let K be any subfield of C (usually we take K = Q) and α_1, ..., α_n
be any complex numbers. We discussed ‘field’ in Section 1.1.1. By L = K(α_1, ..., α_n)
we mean the smallest field containing K and all α_i. In other words, L consists of all
rational functions poly/poly of the α_i, with coefficients in K. Then L can be thought of
as a vector space over K; write [L : K] ≤ ∞ for the dimension of that vector space. We
say L is an extension of the base-field K of degree [L : K]. The most interesting case is
when the degree [L : K] is finite. In this case we can find a single number α ∈ L such
that L = K[α], where as always we write R[x] for all polynomials in x with coefficients
in R. Then α will be a zero of a monic polynomial p(x) ∈ K[x] of degree [L : K],
called the minimal polynomial of α. Such α are called algebraic, and such extensions
K[α] are called finite. The finite extensions most relevant for this book are discussed in
Section 1.7.3.

Numbers of course arise throughout science in their role as coordinates; less appreciated
is that observing the specific kinds of numbers that arise can provide profound
structural information. This is very much how algebraic number theory impinges on the
areas considered in this book. For an elementary example, recall that Euclid’s books
are filled with geometric constructions, particularly those involving straight-edge (i.e.
drawing the line passing through two points) and compass (i.e. drawing the circle with
given centre and radius). The reader can discover for herself how to trisect line segments
and double the area of a square, using only straight-edge and compass. But some
problems weren’t solved back then: for example, how to trisect an angle or double the
volume of a cube. To solve these, consider coordinates. Suppose we start with N points
(x_i, y_i). We can construct the line joining any two of those points, and the circle centred
at some (x_i, y_i) with some radius |(x_j, y_j) − (x_k, y_k)|; we can construct new points only
as intersections of these lines and circles. Now, if we let K denote the field generated
from Q by all 2N coordinates x_i, y_i, then the equations of our lines and circles will
have coefficients belonging to K. The coordinates of the intersection of any two such
lines will lie in K, while that of the intersection of a line with a circle, or of two circles,
will lie in an extension L of K of degree [L : K] ≤ 2. Continuing in this way,
we see that any construction, no matter how involved, can only construct points whose
coordinates lie in some extension L of K of degree a power of 2. Now, given an angle
θ, defined by points (0,0), (1,0) and (cos(θ), sin(θ)), trisecting θ means constructing the
point (cos(θ/3), sin(θ/3)). But α = cos(θ/3) obeys cos(θ) = 4α³ − 3α, i.e. cos(θ/3) lies
(generically) in a degree-3 extension of K = Q[cos(θ), sin(θ)]. Thus we cannot trisect
that angle, using only a compass and straight-edge, for most θ (e.g. θ = 60°).

The degree [L : K] is a (rather crude) invariant of the field extension L ⊃ K. We
have just seen the power of this simple invariant; in the next subsection we refine it 
considerably, giving it a group structure. 
Consider ‘Fermat’s Last Theorem’, which asserts that there are no positive integer
solutions to the equation x^n + y^n = z^n, for n > 2. It is tempting to factorise this, as
Fermat himself probably did, into

∏_{j=0}^{n−1} (x − ξ_{2n}^{2j+1} y) = z^n,

where ξ_m = exp[2πi/m], and to try to show from this that each a − ξ_{2n}^{2j+1} b has an
‘integral’ nth root, if x = a, y = b, z = c is an integral solution. We return to Fermat’s
Last Theorem in Section 2.2.1.
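A quick numerical sanity check of this kind of factorisation: x^n + y^n equals the product of x − ωy over the n complex numbers ω with ω^n = −1, i.e. ω = exp[πi(2k + 1)/n] for k = 0, ..., n − 1. The sketch below (Python, floating point, with a function name of our own choosing) multiplies the factors out; it illustrates the identity and proves nothing.

```python
import cmath

def fermat_product(x, y, n):
    """Multiply out prod_{k=0}^{n-1} (x - w_k * y), where the w_k are the
    n-th roots of -1, namely w_k = exp(pi * i * (2k + 1) / n)."""
    prod = 1
    for k in range(n):
        w = cmath.exp(1j * cmath.pi * (2 * k + 1) / n)
        prod *= (x - w * y)
    return prod

for n in (2, 3, 5, 8):
    # The two printed values should agree up to rounding error.
    print(n, fermat_product(3, 4, n), 3 ** n + 4 ** n)
```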

These examples should give the reader some appreciation for the value of using non- 
integers to study integers, and also provide some impetus for extending the tools and 
notions of high school number theory (primes, divisibility, etc.) to complex numbers. 
The result is algebraic number theory. A classic introduction is [282]; the book [515] is 
filled with concrete examples. 

Euler worked with numbers of the form ℓ + m√n, for ℓ, m, n ∈ Z, and regarded them
as generalised integers, carrying over (without proof) their divisibility laws, etc. from
the usual integers. However, it was soon learned that care must be taken. For a simple
example, the factorisation 2 = (n − √(n² − 2))(n + √(n² − 2)) holds for all n ∈ Z, so what
should the ‘unique prime factorisation’ of 2 be?

The basic theory was developed in the nineteenth century, by Kummer, Dedekind,
Frobenius and others. Take the base field K to be Q for convenience, and fix a finite
extension L = Q[α]. Any z ∈ L is algebraic, i.e. satisfies a_m z^m + a_{m−1} z^{m−1} + ··· +
a_0 = 0 for some a_i ∈ Z (not all zero). The L-integers are those numbers z ∈ L that satisfy
z^m + a_{m−1} z^{m−1} + ··· + a_0 = 0 for some a_i ∈ Z (i.e. a_m = 1). The sums and products of
L-integers are L-integers, and so we call the set R_L of all these L-integers the ring
of integers. For example, when L = Q, Q[i], Q[√2] and Q[√5], the rings
of integers are Z, Z + iZ, Z + √2 Z, and

{(m + n√5)/2 | m, n ∈ Z, m − n ∈ 2Z},

respectively. All elements of L are quotients of L-integers, just as all r ∈ Q equal a/b
for a, b ∈ Z.

What should prime mean here? The obvious guess would be any number γ ∈ R_L
whose only divisors β are trivial, i.e. the only β ∈ R_L with γ/β ∈ R_L are units or γ
times units. Units are the analogue here of ±1: an L-integer u is a unit iff u^{−1} is also an
L-integer. The only problem with this definition of prime is that unique factorisation is
usually lost. For example, in L = Q[√−26], the L-integers are Z + √−26 Z; we have
the equation

3³ = 27 = (1 + √−26)(1 − √−26)

and yet, as the reader can easily verify, both 3 and 1 + √−26 are primes by our definition.
Incidentally, most finite extensions L have infinitely many L-units (e.g. (1 + √2)^n is a
unit of Q[√2] for any n ∈ Z).
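The verification left to the reader can be mechanised using the norm N(a + b√−26) = a² + 26b², which is multiplicative. The sketch below (Python; the pair representation and the function names are our own) checks the displayed product and searches for elements of small norm. Since no element has norm 3, neither 3 (norm 9) nor 1 + √−26 (norm 27) can split into two non-units.

```python
def mul(u, v):
    """Multiply a + b*sqrt(-26) by c + d*sqrt(-26); numbers are pairs (a, b)."""
    (a, b), (c, d) = u, v
    return (a * c - 26 * b * d, a * d + b * c)

def norm(u):
    """N(a + b*sqrt(-26)) = a^2 + 26 b^2, a multiplicative function."""
    a, b = u
    return a * a + 26 * b * b

def elements_of_norm(N):
    """All integer pairs (a, b) with a^2 + 26 b^2 = N, by brute force."""
    bound = int(N ** 0.5) + 1
    return sorted((a, b)
                  for a in range(-bound, bound + 1)
                  for b in range(-bound, bound + 1)
                  if norm((a, b)) == N)

print(mul((1, 1), (1, -1)))    # the product (1 + sqrt(-26))(1 - sqrt(-26))
print(elements_of_norm(1))     # the units
print(elements_of_norm(3))     # candidates for a proper divisor of 3
```

A proper factorisation of 3 would need two factors of norm 3, and one of 1 + √−26 would need factors of norms 3 and 9; the empty search for norm 3 rules out both.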
The correct definition of prime (Dedekind, 1871) is a gem. Replace the single L-integer
γ ∈ R_L with the set of all multiples γR_L =: (γ) of that number. This washes
away the irritating ambiguity due to units. Any subset I ⊂ R_L closed under R_L-linear
combinations (i.e. for which Σ a_i z_i ∈ I for all z_i ∈ I and a_i ∈ R_L) is called an ideal of
R_L. For example, (γ) is always an ideal, though for typical rings R_L, most ideals won’t
have a single generator. Consider any ideals I, J of R_L. By the product of ideals we
mean

IJ = {Σ_i a_i b_i | a_i ∈ I, b_i ∈ J}.

A prime ideal is defined to be any nonzero ideal P ≠ R_L, such that IJ = P for ideals
I, J only if I = R_L or J = R_L. In R_L, any prime ideal P is maximal (and conversely):
the only ideals I satisfying P ⊆ I ⊆ R_L are I = P, R_L. Although unique factorisation
usually won’t hold for L-integers, it always holds for ideals: any nonzero ideal I of the
ring R_L of integers can be written uniquely as a product of prime ideals.

For example, the prime ideals of Z are (p) for p prime, and this reduces to the
usual unique factorisation of integers. The unique factorisation of the ideal (27) in
the field Q[√−26] is (27) = P_+³ P_−³, where P_± := (3, 1 ± √−26) = (3) + (1 ± √−26).
Thus neither (3) = P_+ P_− nor (1 ± √−26) = P_±³ are prime.

We are thus led to picture L-integers as ideals of the ring R_L. In fact the name ‘ideal’,
now standard in algebra, was chosen because it corresponds to an ideal (as opposed to
true) number.

This reinterpretation of integers as ideals has a striking geometric parallel. We are 
taught to study a geometric space X through the functions f € C[X] that live on it. 
In this language, what should play the role of a point x ∈ X? Given any point a ∈ X,
we can evaluate these functions f(x) at x = a. Algebraically, this corresponds to a
homomorphism C[X] → C. Those homomorphisms, via their kernels, are essentially
in one-to-one correspondence with ideals of the ring C[X], and thus we should identify
points x ∈ X with certain ideals in C[X]. Looking at concrete examples such as X = C^n,
we find that ideals correspond more generally to submanifolds (subvarieties) in X, and 
that maximal ideals correspond to points. This unexpected and deep connection between 
number theory and geometry is a great illustration of the effectiveness of abstract algebra. 


1.7.2 Galois 


Evariste Galois was a brilliantly original French mathematician. Born shortly before 
Napoleon’s ill-fated invasion of Russia, he died shortly before the ill-fated 1832 uprising 
in Paris. His last words: ‘Don’t cry, I need all my courage to die at 20’. 

Galois grew up in a time and place confused and excited by revolution. He was known 
to say ‘if only I were sure that a body would be enough to incite the people to revolt, 
I would offer mine’. On 2 May 1832, after frustration over failure in love and failure 
to convince the Paris mathematical establishment of the depth of his ideas, he made 
his decision. A duel was arranged with a friend, but only his friend’s gun would be 
loaded. Galois died the day after that bullet perforated his intestine. At his funeral it was
discovered that a famous general had also just died, and the revolutionaries decided to
use the general’s death rather than Galois’ as a pretext for an armed uprising. A few days
later the streets of Paris were blocked by barricades, but not because of Galois’ sacrifice:
his death had been pointless [529].¹⁶

Galois theory in its most general form is the study of relations between objects defined
implicitly by some conditions.¹⁷ For example, the objects could be the solutions to a given
differential equation. Or the objects could be the different points π^{−1}(p) ⊂ Y sitting
above a given point p ∈ X in a cover π : Y → X. In the most familiar incarnation of
Galois theory, the objects are the zeros of certain polynomials.

Look at complex conjugation z ↦ z̄: the conjugate of a product wz is w̄ z̄, and the
conjugate of a sum w + z is w̄ + z̄. Also, x̄ = x for any
x ∈ R. So we can say that z ↦ z̄ is a structure-preserving map C → C (called an
automorphism of C) fixing the reals. We say that complex conjugation belongs to the Galois
group Gal(C/R) of C over R; apart from complex conjugation, it contains only the
identity automorphism.

A way of thinking about the automorphism z ↦ z̄ is that it says that, as far as the real
numbers are concerned, i and −i are identical twins. Algebra alone can’t tell that i is in
the upper half-plane, or that going from 1 to i is going counterclockwise about 0, while
1 to −i is clockwise.

Let L be any field containing Q. The Galois group Gal(L/Q) is the set of all 
automorphisms=symmetries of L that fix all rationals. 

For example, L = Q[√5] is the field of all numbers of the form a + b√5, where
a, b ∈ Q. Let’s try to find its Galois group. Let σ ∈ Gal(L/Q). Then σ(a + b√5) =
σ(a) + σ(b)σ(√5) = a + bσ(√5), so once we know what σ does to √5, we know
everything about σ. But 5 = σ(5) = σ(√5²) = (σ(√5))², so σ(√5) = ±√5 and again
there are precisely two possible Galois automorphisms here (one is the identity). As far
as the arithmetic of Q is concerned, ±√5 are interchangeable.
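That the swap √5 ↦ −√5 really is structure-preserving can be checked with exact arithmetic. Here is a minimal sketch in Python; the class QSqrt5 and the function sigma are illustrative names of our own, and elements a + b√5 are stored as pairs of rationals.

```python
from fractions import Fraction

class QSqrt5:
    """The number a + b*sqrt(5) with a, b rational, stored exactly."""
    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)
    def __add__(self, other):
        return QSqrt5(self.a + other.a, self.b + other.b)
    def __mul__(self, other):
        # (a + b*sqrt5)(c + d*sqrt5) = (ac + 5bd) + (ad + bc)*sqrt5
        return QSqrt5(self.a * other.a + 5 * self.b * other.b,
                      self.a * other.b + self.b * other.a)
    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)
    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(5)"

def sigma(z):
    """The nontrivial Galois automorphism: sqrt(5) -> -sqrt(5)."""
    return QSqrt5(z.a, -z.b)

x = QSqrt5(Fraction(1, 2), Fraction(1, 2))    # (1 + sqrt5)/2
y = QSqrt5(3, -2)
print(sigma(x * y) == sigma(x) * sigma(y))    # sigma respects products
print(sigma(x + y) == sigma(x) + sigma(y))    # sigma respects sums
print(sigma(QSqrt5(7)) == QSqrt5(7))          # sigma fixes the rationals
```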

Consider more generally any extension L of the base field K of degree n = [L :
K] < ∞. As mentioned in the last subsection, these are always of the form L = K[α],
where α is the root of a monic polynomial p(x) of degree n with coefficients in K.
This means any z ∈ L is expressible as a polynomial in α with coefficients in K, of
degree < n. Hence, any automorphism σ ∈ Gal(L/K) is uniquely specified by the value
σ(α) ∈ L. Since σ(p(x)) = p(σx), σ must send α to one of the n roots of p(x). Thus
‖Gal(L/K)‖ ≤ [L : K]. Extensions L for which Gal(L/K) is maximally large (i.e. of
order n) are the most interesting and are called Galois: they are the extensions for which
all roots of p(x) are in L.


16 Apparently this treatment of Galois’ life has been disputed. But surely the main purposes of history are for 
supplying a context and motivation, for its sheer entertainment value, and for drawing Lofty Morals. And 
at least when they are successful, it is probably wisest if neither motivation nor entertainment nor Morality 
be investigated too closely... 

17 This is the dynamic point of view, but the reader should be warned that there is an alternate interpretation. 
Abstracting out the more structural side of Galois theory, many authors regard Galois theory as ultimately 
a contravariant functorial correspondence associating to some objects A, B, ... (e.g. groups) other objects 
K, L, ... (e.g. fields invariant under the group action) in such a way that A ⊂ B corresponds to K ⊃ L.
Let L ⊃ K be a finite Galois extension, and write G := Gal(L/K). The classical
Galois Theorem sets up a natural bijection between fields J, L ⊇ J ⊇ K, and subgroups
H of G. In particular, to the field J associate the subgroup H = Gal(L/J), and to the
subgroup H associate the space (in fact field) J = L^H of all elements z ∈ L fixed by all
σ ∈ H. Then [J : K] = ‖G/H‖, and the extension J ⊃ K is Galois iff H is normal in
G, in which case Gal(J/K) ≅ G/H.

We saw earlier the power of the numerical invariant [L : K]. We should think
of Gal(L/K) as a group-valued refinement of degree. For an application, suppose
for contradiction that we have a general formula for the zeros of any polynomial
a_n x^n + a_{n−1} x^{n−1} + ··· + a_0 of degree n. For n = 2 we have the quadratic formula
(which involves square-roots), and we’ve all seen the formula for n = 3 (which
involves square-roots and cube-roots). Does there exist a formula for any n, involving
taking arbitrary nested roots of rational expressions in the coefficients a_i? Let
K = Q[a_0, ..., a_n, ξ_1, ξ_2, ...]; we include in K all roots of unity so that all extensions
below will be Galois. Then the first kth root we come to in our formula will move
us into a Galois extension K_1 of K, with Galois group Gal(K_1/K) ≅ Z_k. If the hypothetical
formula involves a second radical, requiring us to take say an ℓth root of a rational
expression in K_1, then this takes us into a Galois extension K_2 of K_1, with Galois group
Gal(K_2/K_1) ≅ Z_ℓ; that is, Gal(K_2/K) is an extension of the cyclic group Z_ℓ by Z_k.
Continuing in this way until all roots in our hypothetical formula are exhausted, we would
find that the zeros of the general degree-n polynomial would lie in a Galois extension L
of K whose Galois group is obtained by repeatedly extending by cyclic groups. Such a
group is called solvable (Section 1.1.3) for this reason. It is easy to see that Gal(L/K)
here is in fact the symmetric group S_n, and that S_n is solvable iff n ≤ 4 (recall that A_5
is simple!). Thus a general formula for the roots of a general polynomial of degree n,
involving nested radicals, can exist only for n ≤ 4.

Every area of mathematics has a Galois-type theory. In geometry, for instance, covers
f : M → N of a fixed manifold N are in one-to-one correspondence with subgroups
H ≅ π_1(M) of the fundamental group G := π_1(N); γ ∈ π_1(N) belongs to H iff γ lifts
to a closed loop in M. When the subgroup H is normal, G/H is naturally isomorphic
to the group of all homeomorphisms α : M → M satisfying f ∘ α = f (these α are
called covering transformations). See the beautiful book [363]. The question ‘What is
the Galois theory for von Neumann algebras?’ led Jones to subfactor theory M ⊃ N; for
instance, his index [M : N] ∈ R ∪ {∞} plays the role of the degree [L : K] ∈ Z ∪ {∞}.
Just as the degree [L : K] can be refined into the Galois group Gal(L/K), the Jones index
can be refined into a topological field theory (see Section 6.2.6).

Galois theory is reminiscent, at least qualitatively, of Gödel’s Incompleteness Theorem.
In mathematics we generally start with a model (e.g. Euclidean geometry or the
natural numbers) that we try to capture implicitly by an axiomatic system. Gödel’s
Theorem tells us that there are infinitely many different models compatible with the 
given axiomatic system, regardless of how many axioms we include. Each of these 
is obtained by realising in incompatible ways the undefined terms of the axiomatic 
system. 
Of course it is the model and not the axiomatic system in which most mathematics
occurs. For example, we don’t criticise Wiles’ work on Fermat’s Last Theorem on the 
grounds that his proof assumes N is embedded in C, even though this transcendental 
interpretation of N surely is not a consequence of Peano’s axioms (the axiomatic system 
describing the natural numbers). Likewise, [459] gives a simple statement about N; it is 
easy to prove using standard arguments involving R, but neither it nor its negation can 
be proved using only Peano’s axioms. 


1.7.3 Cyclotomic fields 


We are primarily interested in a simple class of numbers: those in the cyclotomic
extensions of Q. These are the fields Q[ξ_n], consisting of all polynomials
a_m ξ_n^m + a_{m−1} ξ_n^{m−1} + ··· + a_0 in the root of unity ξ_n := exp[2πi/n], for all a_i ∈ Q.
For instance, cos(πr), sin(πr) and √r are cyclotomic numbers for any r ∈ Q. In particular,

cos(2π m/n) = (ξ_n^m + ξ_n^{−m})/2        (1.7.1a)

sin(2π m/n) = (ξ_n^m − ξ_n^{−m})/(2i)     (1.7.1b)

√p = c_p Σ_{k=0}^{p−1} ξ_p^{k²}           (1.7.1c)

for any nonzero m, n ∈ Z, and any odd prime p, where c_p = 1 or −i for p ≡ 1 or 3
(mod 4), respectively ((1.7.1c) is called a Gauss sum). Only countably many complex
numbers are cyclotomic, i.e. lie in ∪_{n=1}^∞ Q[ξ_n], so almost every complex number is not
cyclotomic.

Cyclotomic numbers are the numbers in the character tables of finite groups, the values 
of Lie group characters at elements of finite order, the values of quantum-dimensions in 
RCFT, and the matrix entries in the SL2(Z)-representation coming from rational VOAs. 
The theory is deeply entwined with that of modular forms and functions, as we see 
in Section 2.3.3. The key property of cyclotomic numbers, which accounts for their 
ubiquity, has to do with their Galois groups. 

As usual, an automorphism σ ∈ Gal(Q[ξ_n]/Q) is uniquely determined by what it does
to the generator ξ_n. Since ξ_n^n = 1, we see that σ must send ξ_n to another nth root of 1, ξ_n^ℓ
say; in fact σ(ξ_n) must be another ‘primitive’ nth root of 1, that is ℓ must be coprime to n.
So Gal(Q[ξ_n]/Q) is isomorphic to the multiplicative group Z_n^× of numbers between 1 and
n coprime to n. To see what σ does to some z ∈ Q[ξ_n], we find the ℓ ∈ Z_n^× corresponding
to σ and write z as a Q-polynomial p(ξ_n): then σz = p(ξ_n^ℓ). For example,

σ(cos(2πa/n)) = σ((ξ_n^a + ξ_n^{−a})/2) = (ξ_n^{aℓ} + ξ_n^{−aℓ})/2 = cos(2πaℓ/n).

The defining property of cyclotomic numbers is a central result of classical number 
theory: 
Theorem 1.7.1 (Kronecker–Weber) Let L be a finite Galois extension of Q with
abelian Galois group Gal(L/Q). Then L is contained in some cyclotomic extension
Q[ξ_n].

The proof is quite complicated. Conversely, any cyclotomic extension Q[ξ_n] of Q is
finite Galois and has abelian Galois group. In fact, the degree of Q[ξ_n] is given by
Euler’s φ-function:

[Q[ξ_n] : Q] = φ(n) = n ∏_{p|n} (p − 1)/p.

The minimal polynomial of ξ_n is called the nth cyclotomic polynomial; a manifestly
integral construction for it is given in [64]. Its zeros are ξ_n^i for each i coprime to n.
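For small n the cyclotomic polynomials can be computed directly. The sketch below is not the construction of [64]; it uses the standard recursion that x^n − 1 is the product of the dth cyclotomic polynomials over all divisors d of n, dividing out the proper divisors by exact integer long division. Polynomials are coefficient tuples, lowest degree first, and the function names are ours.

```python
from functools import lru_cache
from math import gcd

def poly_divide(num, den):
    """Exact division of integer polynomials given as coefficient sequences
    (lowest degree first). den must be monic, so no fractions appear."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        q = num[i + len(den) - 1]
        quot[i] = q
        for j, d in enumerate(den):
            num[i + j] -= q * d
    assert not any(num), "division was not exact"
    return quot

@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of the n-th cyclotomic polynomial, lowest degree first."""
    poly = [-1] + [0] * (n - 1) + [1]            # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = poly_divide(poly, cyclotomic(d))
    return tuple(poly)

def euler_phi(n):
    """Euler's function, by directly counting 1 <= k <= n coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

for n in (1, 2, 3, 4, 6, 12):
    print(n, cyclotomic(n), euler_phi(n))
```

In each case the degree of the computed polynomial agrees with φ(n), as the degree formula requires.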

The ring of cyclotomic integers R_{Q[ξ_n]} is simply Z[ξ_n]. For all n ≠ 1, 2, 3, 4, 6, Q[ξ_n] has
infinitely many units: for example, (ξ_n^i − 1)/(ξ_n − 1) is a unit of infinite order, for any
1 < i < n − 1 coprime to n. Unique factorisation at the level of numbers (as opposed to
ideals, where it always holds) fails in all but 30 cyclotomic fields (Q[ξ_23] is the first field
for which it fails).

Kronecker’s Jugendtraum (‘dream of youth’) [546] proposes that just as all abelian
extensions of Q are obtained by adjoining to Q the values of a transcendental function
(namely exp[2πiz]) at certain algebraic numbers (namely z ∈ Q), something similar
should happen for abelian extensions of other finite extensions K. This is still far from
understood in general, but we know that any abelian extension of K = Q[√−d] is
contained in an extension of K by a root of unity, square-roots of integers, and the
j-function (0.1.8) evaluated at (a + √b)/2 for some a, b ∈ Z.


Question 1.7.1. [572] (a) Show that a prime p ≡ 3 (mod 4) cannot be written in the form
a² + b² for integers a, b.
(b) Let p ≡ 1 (mod 4) be prime. Define

S_p = {(x, y, z) ∈ Z³ | x > 0, y > 0, z > 0, x² + 4yz = p}.

Verify that for any (x, y, z) ∈ S_p, both x ≠ y − z and x ≠ 2y. Define a map L on S_p by

L(x, y, z) = (x + 2z, z, y − x − z)   if x < y − z
           = (2y − x, y, x − y + z)   if y − z < x < 2y
           = (x − 2y, x − y + z, y)   if x > 2y

Verify that L is an involution (i.e. L(L(x, y, z)) = (x, y, z)), and that L has exactly one
fixed point. Show that this implies that the cardinality ‖S_p‖ must be odd, and thus that
the involution (x, y, z) ↦ (x, z, y) must also have a fixed point. Conclude that any prime
p ≡ 1 (mod 4) has a solution p = a² + b².
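Before attempting the proof, the reader may like to watch the involution at work on small primes. This Python sketch (function names ours) enumerates the triples (x, y, z), applies the map L, and reads off a two-squares representation from a fixed point of the swap (x, y, z) ↦ (x, z, y). It is an experiment and not a substitute for the argument.

```python
from math import isqrt

def solutions(p):
    """All (x, y, z) of positive integers with x^2 + 4yz = p."""
    sols = []
    for x in range(1, isqrt(p) + 1):
        rest = p - x * x
        if rest > 0 and rest % 4 == 0:
            m = rest // 4
            sols += [(x, y, m // y) for y in range(1, m + 1) if m % y == 0]
    return sols

def involution(t):
    """The map L of the question; the boundary cases x = y - z and x = 2y
    do not occur when p is prime."""
    x, y, z = t
    if x < y - z:
        return (x + 2 * z, z, y - x - z)
    if x < 2 * y:
        return (2 * y - x, y, x - y + z)
    return (x - 2 * y, x - y + z, y)

def two_squares(p):
    """Read p = a^2 + b^2 off a fixed point (x, y, y) of the swap y <-> z."""
    x, y, _ = next(t for t in solutions(p) if t[1] == t[2])
    return (x, 2 * y)

for p in (5, 13, 17, 29, 37, 41):
    S = solutions(p)
    fixed = [t for t in S if involution(t) == t]
    print(p, len(S), fixed, two_squares(p))
```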


Question 1.7.2. Suppose we are given two points P, Q in the plane, distance 1 apart.
Determine whether it is possible, using only a straight-edge and compass, to construct a
point R collinear with P and Q such that the distance between P and R is 2^{−1/3}. What
if the distance between P and R is instead required to be 2^{−1/4}?




Question 1.7.3. Let K = Q[2^{1/3}]. Show that Gal(K/Q) is trivial.


Question 1.7.4. Let L = Q[√2, √3].

(a) Find an α such that L = Q[α].

(b) Find Gal(L/Q). Is L Galois?

(c) For each subgroup H of Gal(Q[√2, √3]/Q), find the corresponding extension J.


Question 1.7.5. (a) Show that the values ch(g) of characters are always cyclotomic integers.
After reading this section, can you add anything to your answer to Question 1.1.5?
(b) Let G be any finite group. Prove: G is simple iff for all irreducible characters ch of 
G, ch(a) = ch(e) only when a = e. 


Question 1.7.6. Find all rational numbers r such that cos(2πr) ∈ Q.


2 
Modular stuff 


This chapter introduces modular functions and forms, a subject central to the remainder 
of the book. Some earlier parts of this chapter are beautifully covered in [414]. 

Section 2.1 supplies the underlying geometry, but can be skimmed on a first reading. 
In spite of this background material, the theory of modular forms and functions discussed 
in Sections 2.2 and 2.3 will probably appear as somewhat arbitrary to the uninitiated 
reader. Section 2.4.1 addresses some of this apparent artificiality, by developing the 
broader context of automorphic forms. 

As explained in the introductory chapter, Moonshine involves unexpected occurrences 
of modularity. The modularity of Moonshine functions follows from Zhu’s Theorem 
(Theorem 5.3.8). However, the complexity of the underlying mathematics begs the ques- 
tion: Can modularity be established in a more elementary way? The simplest example of 
Moonshine involves theta functions. Hence we explore the limits and potentials of four 
classical strategies for proving the modularity of theta functions: Poisson summation, 
Dirichlet series, the heat kernel and representations of Heisenberg groups (Sections 2.2.3, 
2.3.1, 2.3.4 and 2.4.2, respectively). 

Moonshine has really only been worked out in genus 1,¹ but conformal field theory
tells us that there is an analogue for every genus (Section 6.3.1). It will be much more 
complicated, but it will be more rewarding because the number theoretic side is much less 
developed. In other words, we will find traces of, for example, the Monster in automorphic 
forms for the higher mapping class groups Γ_{g,n} and Sp_{2g}(Z). We include Sections 2.1.4
and 2.3.5 in anticipation of this most natural and significant future development. 


2.1 The underlying geometry 
2.1.1 The hyperbolic plane 


The birth of hyperbolic geometry is one of the most remarkable and instructive episodes in the
history of mathematics. Euclid’s Fifth Postulate² was noticeably more complicated than
the other axioms, looking more like a theorem than a self-evident proposal. Indeed, its 
converse was a theorem proved by Euclid. For example, compare it with Euclid’s First 


1 There are two possible meanings of ‘genus’ in a phrase like ‘higher genus Moonshine’. Ordinary 
Monstrous Moonshine is genus 0 in the sense that the j-function is a Hauptmodul, i.e. a function on a 
sphere. It is genus 1 in the sense that the argument τ of j parametrises different tori. In this paragraph we
are anticipating Moonshine’s extension to higher genus in this second sense. 

2 Also called the Parallel Postulate, it is equivalent to the simpler statement: Given any line L and a point p 
not on L, there is a unique line parallel to L that passes through p. 


Fig. 2.1 Several parallel lines in the hyperbolic plane H. 


Postulate: There is a unique line passing through any two points, or Euclid’s Fourth 
Postulate: All right angles are equal. For centuries, starting with Archimedes, math- 
ematicians (both professional and amateur) tried to prove it from the other axioms. 
Finally in 1868 Beltrami established its independence by finding models for the hyper- 
bolic plane, proving the conjecture of Gauss, Bolyai and Lobachevski as to the existence 
(i.e. internal consistency) of this non-Euclidean geometry. (More precisely, Beltrami’s 
models reduced the question of the consistency of hyperbolic geometry to the consis- 
tency of Euclidean geometry.) Far from being an artificial construct, we’ve now learned 
that hyperbolic geometry is far more important than Euclidean geometry, at least in two 
and three dimensions. 
In place of the Euclidean plane R², consider the upper half-plane


\[ \mathbb{H} := \{(x, y) \in \mathbb{R}^2 \mid y > 0\} = \{\tau \in \mathbb{C} \mid \mathrm{Im}\,\tau > 0\}. \tag{2.1.1} \]


The angles between intersecting curves in H are measured as in R² (namely, take the
angle between the two Euclidean lines tangent to the curves at the point of intersection). 
However, the hyperbolic lines consist of all half-lines perpendicular to the x-axis, together 
with all semi-circles with centre on the x-axis (see Figure 2.1). All axioms of Euclidean 
geometry hold here (e.g. between any two distinct points there passes a unique line), 
except for the Parallel Postulate: there are always infinitely many hyperbolic lines parallel 
to a given hyperbolic line L and passing through a given point p ∉ L.

It is possible to prove from the other axioms that the remaining possibility (namely 
that there are no lines parallel to line L through point p) cannot occur. Nevertheless, 
there is a second kind of non-Euclidean geometry, called spherical geometry. In place of 
R² we have the sphere S², and lines now are great circles. If we identify antipodal points
±p ∈ S², then we get a geometry satisfying most of Euclid’s axioms. The exceptions are
that we can’t speak unambiguously of the portion of a line between two points, and the 
Parallel Postulate (there are no parallel lines). Spherical geometry is older than Euclid — 
we needed it, for example, in our study of the night sky. 

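In Euclidean R² the metric (infinitesimal length-squared) is given by ds² = dx² + dy²,
and so the arc-length of a curve γ = (γ₁, γ₂) : [0, 1] → R² is


\[ \mathrm{length}(\gamma) := \int_0^1 \sqrt{\gamma_1'(t)^2 + \gamma_2'(t)^2}\; dt. \]


On H the arc-length of a curve γ = (γ₁, γ₂) : [0, 1] → H becomes


\[ \mathrm{length}_{\mathbb{H}}(\gamma) := \int_0^1 \frac{\sqrt{\gamma_1'(t)^2 + \gamma_2'(t)^2}}{\gamma_2(t)}\; dt = \int_0^1 \frac{|\gamma'(t)|}{\mathrm{Im}\,\gamma(t)}\; dt. \tag{2.1.2} \]
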
Define the hyperbolic distance dist_H(p, q) between two points p, q ∈ H to be the infimum
inf_γ length_H(γ) of the arc-lengths of all paths γ between p = γ(0) and q = γ(1). Just
as the shortest path (geodesic) between two points in Euclidean geometry is the line 
segment between them, so in hyperbolic geometry it is the hyperbolic line segment. 
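
The definition of hyperbolic distance as an infimum of arc-lengths can be explored numerically. The following is a minimal sketch, not a definitive implementation: it approximates the integral (2.1.2) by a sum over short chords, and compares the result with the closed form cosh dist_H(p, q) = 1 + |p − q|²/(2 Im p Im q), a standard formula that is assumed here rather than taken from the text. The function names and the two sample paths are illustrative choices.

```python
import math

def hyperbolic_length(path, steps=20000):
    """Approximate the hyperbolic arc-length (2.1.2) of a path [0, 1] -> H.

    `path` maps t in [0, 1] to a complex number with positive imaginary part.
    Each short chord contributes its Euclidean length divided by the height
    (imaginary part) at its midpoint.
    """
    total = 0.0
    prev = path(0.0)
    for k in range(1, steps + 1):
        cur = path(k / steps)
        mid_height = (prev.imag + cur.imag) / 2
        total += abs(cur - prev) / mid_height
        prev = cur
    return total

def hyperbolic_distance(p, q):
    """Closed form for dist_H(p, q); a standard formula, assumed here."""
    return math.acosh(1 + abs(p - q) ** 2 / (2 * p.imag * q.imag))

p, q = complex(-1, 1), complex(1, 1)

def straight(t):
    """Euclidean segment from p to q (not a hyperbolic line)."""
    return p + t * (q - p)

def arc(t):
    """Arc of the semicircle |z| = sqrt(2), the hyperbolic line through p and q."""
    angle = 3 * math.pi / 4 - t * math.pi / 2
    return math.sqrt(2) * complex(math.cos(angle), math.sin(angle))

print(hyperbolic_length(straight))   # length of the Euclidean segment
print(hyperbolic_length(arc))        # length of the hyperbolic segment
print(hyperbolic_distance(p, q))     # closed form, for comparison
```

In this example the semicircular arc comes out shorter than the Euclidean segment, in line with the claim that hyperbolic line segments are the geodesics.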

The ‘boundary’ for R² can be thought of as the circular horizon of ‘points at infinity’,
parametrised by angle, and every line touches this circle at two points. Likewise, the
boundary of H can be thought of as the circle R ∪ {∞}, and again every line touches
this circle at two points. This circle will appear as the infinitely distant horizon to beings
living in H. The point ‘∞’ here is often written i∞ to emphasise its relation to the
vertical lines. The difference is that in R², all parallel lines share the same two points at
infinity; in H, parallel lines share at most one point at infinity. 

The most compelling model of the hyperbolic plane is perhaps the Poincaré disc 


D := {z ∈ C | |z| < 1}.


Here, angles are again as in R², but lines consist of diameters of the boundary circle
|z| = 1, together with the intersection of D with circles hitting the boundary |z| = 1 at
right angles. The metric is |dz|²/(1 − |z|²)², and the ‘points at infinity’ form the boundary
circle |z| = 1. The equivalence with H is given by the isometry τ ↦ (τ − i)/(τ + i) taking H onto D.

It may seem strange that both models H and D of hyperbolic geometry have a distorted 
notion of length and line. Is there any way to realise hyperbolic geometry, using a surface 
embedded in R³ inheriting the usual metric and angle of R³? Hilbert proved the answer
is No: There is no complete surface in R³ with constant negative curvature (see e.g.
page 51 of [527]). Nash’s Theorem (footnote 5 in chapter 1) implies though that there
will be an embedding of the hyperbolic plane in some R^n (n = 5 works). ‘Complete’
means that any Cauchy sequence converges, so there aren’t any points missing. To find 
the curvature of a surface at a point, first find the smallest and largest circles hugging 
the surface the closest at that point; the curvature is the inverse product r⁻¹R⁻¹ of their
radii. For example, a sphere of radius r has constant curvature r⁻². A surface with 0
curvature is (locally) flat in one direction — for example, a cylinder or torus has constant 
curvature 0. The small and large circles for a surface Σ with negative curvature have
centres on opposite sides of the tangent plane T_pΣ, like a saddle curving up from front
to back, but curving down from side to side. The hyperbolic plane has constant negative 
curvature (Theorem 2.1.4(b)). 

What is the significance of the word ‘hyperbolic’ here? It was chosen by Klein, partly 
because sinh and cosh appear in many formulae, but also because of another model 
of H. Consider the hyperboloid x₁² + x₂² − x₃² = −1, embedded in Minkowski space
R^{2,1} (so it is a Minkowski sphere of radius i). It consists of two sheets; let’s focus on
the upper one (where x₃ ≥ 1). As a surface in R^{2,1}, it inherits its notions of angle and
metric ds² = dx₁² + dx₂² − dx₃²; in particular this induced geometry is equivalent to
the hyperbolic plane. The lines here consist of the intersection of planes through the 
origin with the upper sheet (when those intersections are non-empty). Stereographic 
projection from the point (0, 0, −1) conformally maps the upper sheet onto the Poincaré
disc D × {0}.
Just as the area of a region R ⊂ R² is given by the double integral ∫∫_R dx dy, so is the
hyperbolic area of region R ⊂ H given by


\[ \mathrm{area}_{\mathbb{H}}(R) := \iint_R \frac{dx\,dy}{y^2}. \tag{2.1.3a} \]


This just says that the hyperbolic area of the infinitesimal rectangle [x, x + dx] × [y, y +
dy] is the product (dx/y) × (dy/y) of hyperbolic length with hyperbolic height. This area formula
fails for macroscopic rectangles, if for no other reason than that there are no macroscopic 
rectangles! In fact, one of the most remarkable formulae of geometry must be the expres- 
sion, originally due to Lambert (1766), for the area of a triangle T in terms of its interior 
angles α₁, α₂, α₃:


\[ \mathrm{area}_{\mathbb{H}}(T) = \pi - \alpha_1 - \alpha_2 - \alpha_3. \tag{2.1.3b} \]


More generally, the area of an n-sided hyperbolic polygon is (n − 2)π − Σ_i α_i. From
this we obtain the non-existence of rectangles. These formulae apply even in the limiting
case where some vertices lie on the boundary R ∪ {i∞}. In particular, the area of any
hyperbolic triangle is bounded above (even though H itself has infinite area)! 
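
The two area formulae can be compared numerically on a simple family of triangles. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes the triangle bounded by the vertical lines x = a and x = b (with −1 < a < b < 1) and the unit semicircle, whose third vertex is the boundary point i∞ with interior angle 0. The interior angles at the two finite vertices, arccos(−a) and arccos(b), were worked out for this example; the function names are hypothetical.

```python
import math

def area_above_unit_circle(a, b, steps=100000):
    """Hyperbolic area (2.1.3a) of {a < x < b, x^2 + y^2 > 1, y > 0}.

    The inner integral of dy / y^2 from sqrt(1 - x^2) to infinity equals
    1 / sqrt(1 - x^2); the outer integral over x uses the midpoint rule.
    Requires -1 < a < b < 1.
    """
    h = (b - a) / steps
    return sum(h / math.sqrt(1 - (a + (k + 0.5) * h) ** 2) for k in range(steps))

def lambert_area(a, b):
    """Area of the same triangle from its interior angles, via (2.1.3b)."""
    angle_left = math.acos(-a)    # angle between x = a and the unit circle
    angle_right = math.acos(b)    # angle between x = b and the unit circle
    angle_at_infinity = 0.0       # the two vertical sides meet at i*infinity
    return math.pi - angle_left - angle_right - angle_at_infinity

print(area_above_unit_circle(-0.5, 0.5), lambert_area(-0.5, 0.5))
print(area_above_unit_circle(-0.3, 0.8), lambert_area(-0.3, 0.8))
```

For a = −1/2, b = 1/2 both angles equal π/3, so both computations should agree with π/3.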

Klein proposed to study geometry using the group of symmetries of whichever geo- 
metric quantities are important to the context (Section 1.2.2). The group Isom(R²) of
isometries (i.e. distance-preserving maps) of R² consists of all translations x ↦ x + a,
all orthogonal maps (rotations and reflections) x ↦ xA where AAᵗ = I, and all combinations
x ↦ xA + b thereof. Likewise, the group Isom(H) of hyperbolic isometries consists
of all Möbius, or fractional linear, transformations 


\[ z \mapsto \frac{az + b}{cz + d}, \qquad a, b, c, d \in \mathbb{R},\ ad - bc = 1, \tag{2.1.4a} \]


together with the reflection z ↦ −z̄, and all combinations thereof. As in the Euclidean
case, Isom(H) is a three-dimensional real Lie group, with two connected components;
the component Isom⁺(H) containing the identity consists of (2.1.4a), and is isomorphic
to


\[ \mathrm{PSL}_2(\mathbb{R}) := \mathrm{SL}_2(\mathbb{R}) \Big/ \left\{ \pm \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right\}. \tag{2.1.4b} \]


As in the Euclidean case, isometries preserve the absolute value |θ| of angles; maps
α ∈ Isom⁺(H) preserve the angles themselves and so are conformal. Isometries preserve
area and send hyperbolic lines to hyperbolic lines. PSL₂(R) preserves everything of
geometric significance and is thus the group of symmetries of the hyperbolic plane.
Likewise, the group Isom⁺(S²) of symmetries of spherical geometry is PSL₂(C), acting
on the Riemann sphere P¹(C) by Möbius transformations. The symmetries PSL₂(R) of
H are precisely those transformations in PSL₂(C) that send H to itself. The only reason
this action by Möbius transformations of the 2 × 2 matrices on P¹(C) or H ∪ {i∞} might
not look strange to us, is because familiarity breeds numbness. Much more natural is 


3 This is the same Lambert who proved the irrationality of π and e.
the action of n × n matrices on C^n, and this induces their action on C^{n−1} (together
with a codimension-2 set of ‘points at infinity’) by interpreting C^n as homogeneous
coordinates for C^{n−1} (Section 1.2.2). Specialising to n = 2 gives us the action (2.1.4a).
In Section 2.4.1 we interpret (2.1.4a) using the multiplication of matrices in SL₂(R).
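
Several of the claims about the action (2.1.4a) are easy to test on examples. The following sketch (with hypothetical helper names, and with the closed-form distance cosh dist_H(p, q) = 1 + |p − q|²/(2 Im p Im q) assumed as a standard fact rather than taken from the text) checks that a real matrix of determinant 1 sends H to itself, preserves hyperbolic distance, and that composing maps corresponds to multiplying matrices.

```python
import math

def mobius(m, z):
    """Apply the fractional linear map (2.1.4a) for m = ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def compose(m, n):
    """Ordinary 2 x 2 matrix product m * n."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def dist_H(p, q):
    """Closed-form hyperbolic distance on H (standard formula, assumed here)."""
    return math.acosh(1 + abs(p - q) ** 2 / (2 * p.imag * q.imag))

m = ((2, 1), (1, 1))   # determinant 2*1 - 1*1 = 1
n = ((1, 3), (0, 1))   # the translation z -> z + 3
z, w = complex(0.3, 0.7), complex(-1.2, 2.5)

# The image stays in the upper half-plane.
print(mobius(m, z).imag > 0)
# Hyperbolic distance is preserved (up to rounding).
print(abs(dist_H(mobius(m, z), mobius(m, w)) - dist_H(z, w)) < 1e-9)
# Composition of maps matches the matrix product.
print(abs(mobius(compose(m, n), z) - mobius(m, mobius(n, z))) < 1e-12)
```

The first check reflects the identity Im((az + b)/(cz + d)) = Im z / |cz + d|² for ad − bc = 1, which is what keeps H invariant.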

A model for n-dimensional hyperbolic geometry is the upper half-space H^n := {(x_i) ∈
R^n | x_n > 0}, which is conformally equivalent to the interior of the unit n-ball, or to the
upper (i.e. x_{n+1} > 0) sheet of the hyperboloid x₁² + ⋯ + x_n² − x_{n+1}² = −1. Euclidean
angle is used, but the metric is ds² = (dx₁² + ⋯ + dx_n²)/x_n². Hyperbolic lines consist
of half-lines and semi-circles perpendicular to the boundary hyperplane x_n = 0;
hyperbolic planes in H^n consist of half-planes and half-spheres perpendicular to the
boundary hyperplane x_n = 0. The hyperboloid model makes it clear that the group
Isom(H^n) of isometries of hyperbolic n-space is isomorphic to the group of those matrices
A ∈ O_{n,1}(R) with A_{n+1,n+1} ≥ 1. The group Isom⁺(H^n) of conformal isometries is the
Lorentz group SO_{n,1}(R)⁺, obeying in addition the condition det(A) = 1. Of course the
Lorentz group SO_{3,1}(R)⁺ is more famous in its incarnation as the symmetry of special
relativity (Section 4.1.2). By identifying the boundary plane of H³ with C, the group
Isom⁺(H³) ≅ SO_{3,1}(R)⁺ can be naturally identified with the Möbius transformations
PSL₂(C).

Recall Hilbert’s theorem from a few paragraphs ago. Although no surface embedded 
in R³ can provide a model of the full hyperbolic plane, such surfaces can provide a model of
a piece of that plane (i.e. be ‘incomplete’). This is accomplished by any surface of 
constant negative curvature. For example, consider the ‘tractrix’ — the path traced by a 
stone, initially placed at (0,1), pulled (‘tractored’) by a string of length 1 as we walk 
along the x-axis. Take the tractrix in the xy-plane and rotate it about the x-axis; the result 
is called the ‘pseudo-sphere’, and is a surface of constant negative curvature in R³. More
generally, by a hyperbolic surface we mean a surface that is also a metric space (i.e. it 
has a notion of distance between points, and of arc-length), which is locally isometric to 
H (i.e. the open sets V_α in Definition 1.2.3 are taken to be in H ⊂ R², and the transition
functions are in Isom(H)). The pseudo-sphere is an example of a hyperbolic surface
different from the hyperbolic plane; crocheting constructs several other examples [284]. 
Similarly, we can define hyperbolic manifolds of arbitrary dimension. We conclude this 
subsection with the classification of all hyperbolic surfaces. But first we need the notion 
of a Fuchsian group. 

As was discussed in Section 1.2.2, tori S¹ × S¹ arise from the quotient R²/L of
the plane by a two-dimensional lattice. This construction is equivalent to the familiar 
depiction of a torus as a parallelogram with opposite sides identified. We discuss
Riemann surfaces in more detail in the next subsection, but a genus-g surface can be depicted
by identifying appropriate sides in a 4g-gon (see Figure 2.2 for the situation with a 
genus 2 surface). This arises from making 2g circular cuts into the surface and flattening 
it out. But can we also interpret that 4g-gon as corresponding to some quotient of R²,
generalising the R²/L construction of a torus? The answer is no: the group Isom(R²)
doesn’t have a rich enough supply of discrete subgroups. We can interpret the 4g-gon
as a quotient, but of the hyperbolic plane and not the Euclidean one. 


Fig. 2.2 A genus 2 surface and its octagon.


Definition 2.1.1 A Fuchsian group is a discrete subgroup Γ of SL₂(R), i.e. one with

\[ \inf\,\{(a - 1)^2 + b^2 + c^2 + (d - 1)^2\} > 0, \quad \text{where the infimum is over all } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \neq I \text{ in } \Gamma. \]


We identify a subgroup Γ of SL₂(R) with its canonical projection Γ̄ into PSL₂(R), since
these give rise to identical surfaces. Examples of Fuchsian subgroups are 


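\[ G_N = \left\{ \begin{pmatrix} \cos(\pi k/N) & \sin(\pi k/N) \\ -\sin(\pi k/N) & \cos(\pi k/N) \end{pmatrix} \;\middle|\; 0 \le k < N \right\}, \quad \forall N = 1, 2, \ldots, \]


\[ G_{\mathbb{Z}} = \left\{ \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} \;\middle|\; k \in \mathbb{Z} \right\}, \]


and the modular group SL₂(Z). The latter is certainly the most interesting of these.

Let Γ be a Fuchsian group. Most points z ∈ H (i.e. all but at most countably many)
are fixed only by the identity in Γ (why?). Let z₀ ∈ H be any of those generic points.
Define the set


\[ D_\Gamma(z_0) := \{ w \in \mathbb{H} \mid \mathrm{dist}_{\mathbb{H}}(z_0, w) < \mathrm{dist}_{\mathbb{H}}(\gamma.z_0, w) \ \text{for all } \gamma \in \Gamma,\ \gamma \neq \pm I \}. \]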


So D_Γ(z₀) is the intersection of a number of hyperbolic half-planes. This set D = D_Γ(z₀)
is called a fundamental domain of Γ, as it satisfies the following properties: (i) it is open;
(ii) each orbit Γ.z intersects D in at most one point, and every orbit intersects the closure
of D in at least one point; (iii) the boundary ∂D of D in H consists of at most countably
many hyperbolic line segments. (In fact, as long as Γ is finitely generated, D can be
chosen with boundary consisting of only finitely many segments.) 

For example, a fundamental domain for G_N consists of the points lying between any
pair of hyperbolic lines intersecting at i with angle 2π/N. A fundamental domain for G_Z
is {z ∈ H | −1/2 < Re z < 1/2}. Choosing z₀ = 2i, we get the fundamental domain D for
SL₂(Z) depicted in Figure 2.3: the vertical sides are Re z = ±1/2, and the circle is |z| = 1.
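
Property (ii) of a fundamental domain can be made concrete for SL₂(Z): every point of H can be moved into the closure of D by some element of the group. The sketch below assumes the standard fact that the translation T : z ↦ z + 1 and the inversion S : z ↦ −1/z generate SL₂(Z); the names T and S and the function names are not taken from the text. It alternates the two moves and keeps track of the matrix that has been applied.

```python
def mobius(m, z):
    """Apply the fractional linear map for m = ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def reduce_to_fundamental_domain(z, max_steps=10000):
    """Move z in H into the closure of D = {|Re z| < 1/2, |z| > 1}.

    Returns (w, m) where m is an integer matrix of determinant 1 and
    w = mobius(m, z) satisfies |Re w| <= 1/2 and |w| >= 1 (up to rounding).
    """
    if z.imag <= 0:
        raise ValueError("z must lie in the upper half-plane")
    a, b, c, d = 1, 0, 0, 1              # running matrix, initially the identity
    for _ in range(max_steps):
        n = round(z.real)                # translate by T^(-n) so that |Re z| <= 1/2
        z = z - n
        a, b = a - n * c, b - n * d      # left-multiply by ((1, -n), (0, 1))
        if abs(z) >= 1:
            return z, ((a, b), (c, d))
        z = -1 / z                       # apply S, which increases Im z when |z| < 1
        a, b, c, d = -c, -d, a, b        # left-multiply by ((0, -1), (1, 0))
    raise RuntimeError("did not converge; Im z may be extremely small")

z0 = complex(7.3, 0.05)
w, m = reduce_to_fundamental_domain(z0)
print(w, m)
```

Each application of S to a point with |z| < 1 strictly increases the imaginary part, which is why the loop terminates for any fixed starting point.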

Applying Γ to a fundamental domain D will tile the hyperbolic plane (see Escher’s
Circle Limit I, II, ... for examples). Since Γ ⊂ Isom⁺(H), each of these tiles is an identical
copy (a congruent translate) of D. All this holds as well in hyperbolic n-space; for
example, an analogue of SL₂(Z) for H³ is SL₂(Z + iZ).
Fig. 2.3 Two fundamental domains for SL₂(Z).


Just as we constructed the torus by identifying opposite sides of the parallelogram, so 
we can obtain a surface by identifying the appropriate sides of the fundamental domain 
of a Fuchsian group Γ. This surface will be a realisation of the orbit space Γ\H (we
write Γ on the left because it acts on the left). Provided no γ ∈ Γ has fixed points in
H (except for the trivial maps γ = ±I), the orbit space Γ\H will inherit the hyperbolic
geometry of H and be a hyperbolic surface. 


Theorem 2.1.2 Any complete hyperbolic surface Σ is isometric to a surface of the
form Γ\H where Γ is a torsion-free Fuchsian subgroup of PSL₂(R). Two such subgroups
Γ₁, Γ₂ define isometric surfaces Γ₁\H and Γ₂\H iff αΓ₁α⁻¹ = Γ₂ for some
α ∈ PSL₂(R).


‘Torsion-free’ means that all nontrivial elements of Γ have infinite order (see Question
2.1.2(b)). Almost all surfaces with a conformal or metric or complex structure are Γ\H for
some Fuchsian subgroup Γ. An unexpected revelation of Thurston’s Programme is that
something similar happens in three dimensions (see the review [497]). Any surface of
genus g ≥ 2 supports uncountably many different hyperbolic structures. By contrast, the
Mostow Rigidity Theorem (1973) tells us that a connected compact oriented manifold
of dimension n ≥ 3 supports at most one.


2.1.2 Riemann surfaces 


Manifolds M, N are homeomorphic if there is a continuous map M → N with continuous
inverse. Compact connected orientable surfaces are characterised, up to homeomorphism,
by the genus g ∈ N. A sphere is genus 0, a torus genus 1, and the double-torus
of Figure 2.2 is genus 2. The surface of a wine glass or fork is topologically a sphere, 
while coffee mugs and keys are (usually) tori. A ladder with n rungs has genus n — 1. 
The surface of a pair of pants is genus 2, while that of a sweater is genus 3. 

A torus can be realised in many different ways. One is the Cartesian product S¹ × S¹
of circles (lay one circle horizontally, then from each point on it place a vertical circular 
rib perpendicular to it, filling out the torus’s surface). A complex curve of the form 
y² = ax³ + bx² + cx + d is a torus (at least if the points at infinity are included), as is
the quotient C/L of the complex plane with a two-dimensional lattice L (Section 1.2.1). 
Fig. 2.4 Diophantus’ argument.


If we drop the requirement that our surface be compact, then up to homeomorphism it 
is uniquely specified by two numbers: the genus g as above, and the number of punctures 
(or boundary components) n. For instance, a sphere with one puncture is homeomorphic 
to an open disc or equivalently the plane C. We see this when we pop a balloon: the 
sphere becomes a rather jagged-edged disc. A sphere with two punctures is a cylinder or 
annulus. 

The non-orientable surfaces have a very similar classification. For example, if we could 
create a P²(R)-shaped balloon, then popping it would create a jagged-edged Möbius band.
We always require orientability in this book. 

The surfaces we encounter have more structure than mere topology. If the surface 
Σ is in fact smooth (Section 1.2.2), then we are interested in its classification up to
diffeomorphism. In this case, though, nothing changes: the surface is again parametrised
by the genus and number of punctures, since any surface Σ has a unique differential structure
compatible with its topology. In order to obtain a finer distinction between the surfaces, 
we need to further enrich their structure. The easiest way to do this is by introducing a 
metric onto the tangent spaces, or give the surface a complex or conformal structure. More 
on the resulting Riemann surfaces shortly. Nevertheless, the genus remains the single 
most important invariant distinguishing Riemann surfaces. There are many qualitative 
differences captured by genus — we will give three of them. 

Diophantus [45] was a mathematical giant who lived in Alexandria in the second or 
third century A.D. He seems to have been the first Greek to regard fractions as legitimate 
numbers, and he was the first to use negative numbers (though only in intermediate 
arithmetical calculations, so he probably didn’t believe in their ontological reality), and the
first to invent an abstract symbolism for algebra. The following (expressed in modern 
language) is how Diophantus found all Pythagorean triples, that is the integer solutions 
to a² + b² = c².

First, it’s enough to look for all rational solutions to the circle x² + y² = 1. Then the
integers a, b, c can be recovered by clearing denominators. Consider a line through the
point (0, 1) that intersects the circle at another rational point (r, s) (see Figure 2.4). Clearly
this line must have rational (or infinite) slope (s − 1)/r. Conversely, consider any line through
(0, 1) with rational slope u: its equation will be y = ux + 1. Where does it intersect the circle?
We get 1 = x² + (ux + 1)² = (u² + 1)x² + 2ux + 1, i.e. x((u² + 1)x + 2u) = 0.
So apart from our original point (0, 1), it will also intersect the circle at


\[ \left( \frac{-2u}{u^2 + 1},\ \frac{1 - u^2}{u^2 + 1} \right). \]

As long as u is rational, so will be this point. Thus Diophantus found a parametrisation 
of all rational points on the circle, and hence all Pythagorean triples. 
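
Diophantus’ parametrisation translates directly into exact rational arithmetic. The following minimal sketch (the function names are illustrative, not from the text) computes the second intersection point for a rational slope u and clears denominators to obtain a Pythagorean triple.

```python
from fractions import Fraction
from math import gcd

def rational_point(u):
    """Second intersection of the line y = u*x + 1 with the circle x^2 + y^2 = 1."""
    u = Fraction(u)
    return (-2 * u / (u * u + 1), (1 - u * u) / (u * u + 1))

def pythagorean_triple(u):
    """Clear denominators in rational_point(u) to get integers with a^2 + b^2 = c^2."""
    x, y = rational_point(u)
    # least common multiple of the two denominators
    c = x.denominator * y.denominator // gcd(x.denominator, y.denominator)
    return int(abs(x) * c), int(abs(y) * c), c

for u in (Fraction(1, 2), Fraction(2, 3), Fraction(3, 4)):
    x, y = rational_point(u)
    print(u, (x, y), pythagorean_triple(u), x * x + y * y == 1)
```

Because `Fraction` arithmetic is exact, the final comparison x² + y² = 1 is a genuine identity check rather than a floating-point approximation.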

His method is far more general than this, as he knew. In fact, consider any nondegen- 
erate conic. To find all rational points on it, we first find one rational point, and then 
consider all lines with rational slope through that point. This will exhaust all rational 
points on the curve. Thus if a conic has one rational point (it might have none), then it 
will have infinitely many, and all can be found explicitly. 

Why won’t this trick work for other equations of this sort? For example, Fermat’s Last 
Theorem challenges us to find a nontrivial rational solution to x^n + y^n = 1, for n > 2.
If we draw a line through the obvious solution (x, y) = (0, 1), we simply get a mess. 
What’s so special, geometrically, about conics? 

The modern way (due to Bézout in the eighteenth century) to think about this is
to regard the given equation, say x² + y² = 1, as an equation relating two complex
numbers (x, y) ∈ C². The result will be a complex curve, that is a real surface. To which
complex curve does x² + y² = 1 correspond? The real curve (a circle) is parametrised
by x = cos θ and y = sin θ, and a moment’s deliberation will convince oneself that
permitting θ to take complex values will exhaust all points on the complex curve. So
write x = (w + w⁻¹)/2 and y = (w − w⁻¹)/(2i) for any w ∈ C except w = 0; this identifies
the complex curve x² + y² = 1 with the complex plane punctured at 0, that is a cylinder.
The unit circle in R² is merely the slice of this cylinder in C² by the plane passing through
the two real axes of C². A different slice will produce, for instance, a hyperbola.
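
As a quick check of this parametrisation (a worked verification added for illustration, writing w = e^{iθ} so that x = cos θ and y = sin θ), any nonzero w does give a point of the complex curve:

```latex
x^2 + y^2
  = \frac{(w + w^{-1})^2}{4} + \frac{(w - w^{-1})^2}{(2i)^2}
  = \frac{(w + w^{-1})^2 - (w - w^{-1})^2}{4}
  = \frac{4\, w\, w^{-1}}{4}
  = 1 .
```

Conversely, w = x + iy recovers w from a point (x, y) of the curve, and w ≠ 0 because (x + iy)(x − iy) = 1.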

More generally, any polynomial in x, y defines a noncompact surface in C². For
example, a nondegenerate cubic y² = x³ + ax² + bx + c is a once-punctured torus:
explicitly, the quotient C′/(Z + τZ), where C′ means deleting from C the lattice points
Z + τZ, is equivalent in every sense one could want (e.g. conformally) to the cubic


\[ y^2 = 4x^3 - 60\,G_4(\tau)\,x - 140\,G_6(\tau), \]


where the Eisenstein series G_k(τ) is defined in (0.1.5). Similarly, the complex curve
x³ + y³ = 1 corresponds to the torus C/(Z + τZ) with three points removed.

In any case, we can now answer our question: What is so special geometrically about 
the conics, that Diophantus’ method works for them? The answer: They are (punctured) 
spheres, that is have genus 0. 

It will always seem that some points ‘at infinity’ are missing from these complex 
curves. Kepler back in 1604 knew that adding such points simplifies the geometry. We 
do this by projectifying the given equation (Section 1.2.2). For example, x² + y² = 1
corresponds to the homogeneous equation x² + y² = z², where we identify (x, y, z) and
(λx, λy, λz) for λ ≠ 0. The two ‘infinite’ points, that is the points with z = 0, are then
(1, ±i, 0). Similarly, the three missing points on the Fermat curve x³ + y³ = 1 have
homogeneous coordinates (x, y, z) = (1, −ξ, 0) for any third root of unity ξ. We see
that in homogeneous coordinates the ‘infinite points’ don’t look so bad. 
Fig. 2.5 Addition of points on a hyperbola.


Fig. 2.6 Addition of points on a cubic. 


Another special property of conics (avoiding the infinite points) is that they are additive 
groups. Fix any point e on the conic C (it will be the identity); given any two finite 
points p,q on the conic, the sum p + q € C is defined to be the intersection with C 
of the line through e parallel to the line through p and q (Figure 2.5). Associativity
follows from Pascal’s Theorem concerning hexagons inscribed in conics. For example,
choosing the identity e = (1, 0) and the parametrisation (x(t), y(t)) = (cos(t), sin(t)) of
the circle x² + y² = 1, this addition of points corresponds to addition of angle t. The same
conclusion holds for the hyperbola x² − y² = 1, with e = (1, 0) and parametrisation
t ↦ (cosh(t), sinh(t)) of the x > 0 branch. See Question 2.1.3.
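
The parallel-chord construction is short enough to test directly. The sketch below is an illustration under stated assumptions: it treats both conics at once as x² + εy² = 1 with ε = +1 (circle) or ε = −1 (hyperbola), a unifying device introduced here rather than in the text, and it only handles distinct points p ≠ q (doubling a point would need the tangent direction instead of the chord).

```python
import math

def add_on_conic(p, q, e=(1.0, 0.0), eps=1.0):
    """Sum p + q on the conic x^2 + eps*y^2 = 1, for distinct points p and q.

    The line through the identity e parallel to the chord pq is e + s*(q - p);
    substituting into the conic gives a quadratic in s with roots s = 0
    (the point e itself) and s = -2*B(e, d)/Q(d), where Q is the quadratic
    form x^2 + eps*y^2 and B its bilinear form.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    q_dd = dx * dx + eps * dy * dy          # Q(d)
    b_ed = e[0] * dx + eps * e[1] * dy      # B(e, d)
    s = -2 * b_ed / q_dd
    return (e[0] + s * dx, e[1] + s * dy)

# Circle: adding the points at angles 0.4 and 1.1 should give the angle 1.5.
t1, t2 = 0.4, 1.1
circle_sum = add_on_conic((math.cos(t1), math.sin(t1)), (math.cos(t2), math.sin(t2)))
print(circle_sum, (math.cos(t1 + t2), math.sin(t1 + t2)))

# Hyperbola (x > 0 branch): parameters 0.3 and -0.9 should add to -0.6.
s1, s2 = 0.3, -0.9
hyperbola_sum = add_on_conic((math.cosh(s1), math.sinh(s1)),
                             (math.cosh(s2), math.sinh(s2)), eps=-1.0)
print(hyperbola_sum, (math.cosh(s1 + s2), math.sinh(s1 + s2)))
```

In both cases the geometric sum agrees with adding the parameters, which is the content of the examples in the paragraph above.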

Better known is the addition of points on a nondegenerate (projective) cubic C. Fix 
any e ∈ C (again it will play the role of identity), and choose any points p, q ∈ C. Let
r ∈ C be the intersection with C of the line through p, q; the sum p + q is defined to
be —r, that is the intersection with C of the line through r and e (see Figure 2.6). This 
also is commutative and associative, provided we include the points at infinity. Addition 
continues to work when the cubic is complexified, and that’s how to make sense of it: the 
resulting surface is a torus, equivalent to one of the form C/(Z + τZ) for some τ ∈ C,
and this addition on the cubic lifts to ordinary addition on C. Incidentally, the addition 
of points is only one of a number of senses in which conics are toy models for the much 
richer theory of elliptic curves (i.e. cubics with a marked point e) [372]. 
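
The chord construction on a cubic can be tried out in exact arithmetic. The sketch below makes several assumptions that are not in the text: the cubic is taken in the form y² = x³ + Ax + B, the identity e is the point at infinity (so that the line through r and e is the vertical line through r, and −r is the reflection of r in the x-axis), and the sample curve y² = x³ + 17 with its integer points is just a convenient illustration.

```python
from fractions import Fraction

A, B = Fraction(0), Fraction(17)   # illustrative curve y^2 = x^3 + 17
INFINITY = None                    # the point at infinity, playing the role of e

def on_curve(p):
    """Check that p lies on y^2 = x^3 + A*x + B (the point at infinity counts)."""
    if p is INFINITY:
        return True
    x, y = p
    return y * y == x ** 3 + A * x + B

def add(p, q):
    """Sum of two points, via the third intersection r of the line through p, q."""
    if p is INFINITY:
        return q
    if q is INFINITY:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and y1 == -y2:
        return INFINITY                          # vertical line: r is at infinity
    if p == q:
        slope = (3 * x1 * x1 + A) / (2 * y1)     # tangent line when p = q
    else:
        slope = (y2 - y1) / (x2 - x1)            # chord through p and q
    x3 = slope * slope - x1 - x2                 # x-coordinate of r
    y3 = slope * (x1 - x3) - y1                  # reflect r in the x-axis to get -r
    return (x3, y3)

P = (Fraction(-2), Fraction(3))
Q = (Fraction(-1), Fraction(4))
R = (Fraction(2), Fraction(5))

print(add(P, Q), on_curve(add(P, Q)))
print(add(add(P, Q), R) == add(P, add(Q, R)))    # associativity on this example
```

Using `Fraction` keeps every coordinate exact, so the associativity check is an equality of rational points and not a numerical coincidence.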

The simplest quantitative distinction between surfaces of different homeomorphism 
type (g, n) is the fundamental group π₁, defined in Section 1.2.3. For example, π₁(S²) = 1
since S² is simply connected, and π₁ of a torus is Z ⊕ Z. Let Σ_g be a compact genus
g > 0 surface. Then π₁(Σ_g) has presentation


\[ \pi_1(\Sigma_g) = \langle \alpha_1, \ldots, \alpha_g, \beta_1, \ldots, \beta_g \mid \alpha_1 \beta_1 \alpha_1^{-1} \beta_1^{-1} \cdots \alpha_g \beta_g \alpha_g^{-1} \beta_g^{-1} = 1 \rangle. \tag{2.1.5a} \]


The generators α_i, β_i are chosen as in Figure 2.2 (α₁ = a, β₁ = b, etc.). The easiest way
to read off the genus from (2.1.5a) is to compute the abelianisation π₁/[π₁, π₁] (which
equals incidentally the first homology group H₁(Σ_g, Z)); as is clear from (2.1.5a), it is
the abelian group Z^{2g} generated by α_i, β_j. On the other hand, the fundamental group of
a genus-g surface Σ_{g,n} with n > 0 punctures is free (see e.g. page 64 of [103]):


\[ \pi_1(\Sigma_{g,n}) \cong F_{2g+n-1}. \tag{2.1.5b} \]


The preceding discussion indicates the significance of genus. Now let’s impose more 
structure. A Riemann surface is a connected orientable surface with a conformal struc- 
ture, together with a choice of orientation. Equivalently, a Riemann surface can be defined 
as a complex analytic curve: any polynomial equation in x, y ∈ C inherits the conformal
and differential structure of C. This is because locally the conformal maps in R² are
precisely the locally holomorphic maps in C with nonvanishing derivative (theorem 14.2 
of [481]). A third possible definition is that Riemann surfaces consist of those connected 
2-manifolds with a complete metric with constant curvature. As mentioned above, its 
homeomorphism class is given by its genus g and number of punctures n, and the surface 
is compact iff n = 0. We are primarily interested in compact Riemann surfaces. 

Any topological surface can be made into a Riemann surface, usually in a continuum 
of inequivalent ways (Section 2.1.4). We identify two Riemann surfaces if they are 
conformally equivalent, or holomorphically equivalent, or isometric. In Section 2.1.4 
we discuss the classification of Riemann surfaces up to conformal equivalence. 

The basic example of a Riemann surface is the complex plane C. Also important 
is the complex projective line P¹(C) = C ∪ {∞}; stereographic projection verifies that
it is topologically a sphere, called the Riemann sphere. Now, a meromorphic function
f : D → C by definition is holomorphic everywhere except for isolated poles; if f has
poles at z_i, then defining f(z_i) = ∞ gives a conformal map f : D → P¹(C) between
Riemann surfaces (perhaps it is this picture, in which z_i is sent to the ‘north pole’
∞, which is the origin of the term ‘pole’). Likewise, we can extend the domain of
a function f on C to P¹(C), provided it is meromorphic at ∞. For example, if p is a
polynomial of degree n, then p has a pole of degree n at ∞, and we obtain a holomorphic
map p : P¹(C) → P¹(C). By comparison, the functions e^z and cos(z) have essential
singularities at ∞ and so cannot be extended to P¹(C).

Historically, Riemann surfaces were introduced by Riemann to supply the maximal 
domain (via analytic continuation) of a holomorphic function. The problem is that many 
of the most natural complex functions are multivalued, for example f(z) = √z or g(z) =
log z or other inverses of nice functions. As we move counterclockwise along the unit
circle |z| = 1, starting at z = 1, the value f(z) = √z changes continuously from f(1) =
1 to f(1) = −1, and the value of g(z) = log z changes continuously from g(1) = 0 to
g(1) = 2πi. To Riemann, we should regard f(z) as a holomorphic function on a double
cover D = D^b ∪ D^t of the complex plane, and g(z) is holomorphic on a helix. As we
move along the circle, the argument z of f(z) moves from the bottom sheet D^b = C
to the top sheet D^t = C, and if we continue a second time around the circle, we return
from the sheet D^t to D^b. To identify D homeomorphically, cut both D^b and D^t from 0
to ∞, and glue the θ = 0⁺ slit of D^b to the θ = 0⁻ slit of D^t and vice versa. The result
is homeomorphic to a sphere with one puncture, corresponding to the point at infinity.
Note that f : D → C is well-defined and holomorphic; it is an example of what we will
shortly call a cover of C, ramified at z = 0. 

The remainder of this subsection describes an important realisation (called uniformi- 
sation) of any Riemann surface. The idea is simple. There are two different connected 
real curves, up to homeomorphism, and they are the line R and the circle S¹. The circle
can be realised as S¹ = R/Z. We call R the ‘universal cover’ S̃¹ of S¹, because it is
simply-connected; Z here is the fundamental group π₁(S¹). See also Theorem 1.4.3.

The same works with surfaces. For example, the sphere with two punctures (a cylin- 
der) and a torus both have universal cover homeomorphic to C; the cylinder itself is 
homeomorphic to S¹ × R and the torus to C/(Z + iZ), where Z and Z + iZ are isomorphic
to their fundamental groups. Let’s make these ideas more precise, and incorporate
as well the conformal structure. 


Definition 2.1.3 Let Σ*, Σ be two Riemann surfaces. We say that Σ* covers Σ by
f if f : Σ* → Σ is a holomorphic map from Σ* onto Σ. If in addition f is locally
conformal, we call f a conformal or unramified cover. If f : Σ* → Σ is a conformal
cover, and Σ* is simply-connected, then we call Σ* a universal cover of Σ.


Let U_α ⊂ Σ, φ_α : U_α → V_α ⊂ C be a family of coordinate charts for Σ
(Definition 1.2.3); by local coordinates we mean the complex numbers z ∈ V_α. In local
coordinate z about point p* ∈ Σ*, a cover f sends a neighbourhood of p* to one of f(p*) ∈ Σ
with local coordinates a + czⁿ + higher terms, for some constants a and c ≠ 0. To be
conformal, this order n must always be 1 (otherwise we say f is ramified at p*).

If f : Σ* → Σ is a conformal cover, then the fundamental group π₁(Σ*) is naturally
isomorphic to a subgroup of π₁(Σ) (Section 1.7.2). In this way, the covers Σ* of Σ (up to
homeomorphism) are in one-to-one correspondence with conjugacy classes of subgroups
of π₁(Σ). A universal cover Σ̃ is the ‘largest’ and most important cover, and is unique up
to conformal equivalence. It can be identified as the space of all homotopy-equivalence
classes of paths on Σ with fixed initial point p ∈ Σ. For example, visualise a ‘point’
p on S̃¹ as a curve starting at 1 ∈ S¹ and ending at e^{iθ} (0 ≤ θ < 2π), and wrapping
around the circle (i.e. crossing 1 ∈ S¹) n times; the identification of S̃¹ with R comes
from identifying this path with the number θ + 2πn ∈ R.

We are now ready to state the basic result of this subsection. 


Theorem 2.1.4 (Uniformisation Theorem) 


(a) Up to conformal equivalence, the only simply-connected Riemann surfaces (i.e. the
only candidates for a universal cover) are the sphere S² = P¹(C) = C ∪ {∞}, the
plane C and the upper half-plane H.

(b) Let Σ be any Riemann surface, and let Σ̃ be its universal cover. Then Σ is conformally
equivalent to Σ̃/Γ, where Γ ≅ π₁(Σ) is a subgroup of the automorphisms of Σ̃ that
act on Σ̃ without fixed points. A metric can be chosen for Σ with constant curvature
+1, 0, −1, respectively, if Σ̃ = S², C, H, respectively. Two surfaces Σ̃/Γ, Σ̃′/Γ′
are conformally equivalent iff the universal covers Σ̃ and Σ̃′ are the same, and Γ
and Γ′ are conjugate subgroups in Aut(Σ̃).

Table 2.1. The universal covers of the genus g surfaces with n punctures


g \ n    0      1       2       ≥3
0        S²     C, H    C, H    H
1        C      H       H       H
≥2       H      H       H       H


Of course H and C are homeomorphic, but they aren’t conformally equivalent (replacing
H with the disc D, this follows from Liouville’s Theorem: a bounded holomorphic
function on C must be constant). Part (a) is due to Klein, Poincaré and Koebe. These three
possibilities for Σ̃ correspond respectively to the three geometries: spherical, Euclidean
and hyperbolic. The group of automorphisms of Σ̃ is just Isom⁺. The condition that Γ
acts without fixed points (apart from the identity in Γ) is significant: fixed points change
the geometry. A famous example of an orbit space with fixed points is SL₂(Z)\H, which
has conical singularities at i and e^{πi/3}.

Table 2.1 gives the universal cover of any Riemann surface, as a function of the genus 
and number of punctures. We see there that almost every surface is hyperbolic: the 
generic geometry in two dimensions is hyperbolic. 

The Uniformisation Theorem easily proves Picard’s Theorem (‘the range f(C) of any
holomorphic nonconstant function f : C → C can omit at most one point from C’). The
proof, which the reader can fill in, uses Liouville’s Theorem together with the fact that
the universal cover of the twice-punctured plane is H.
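For readers who want to compare notes, one way the argument can be organised is sketched
below. This is only an outline under the stated hypotheses; the names a, b for the two
omitted values, π for the covering map and c for the Cayley map are our own notation.

```latex
% Sketch of Picard's Theorem from uniformisation (notation a, b, \pi, c is ours).
\begin{enumerate}
  \item Suppose $f:\mathbb{C}\to\mathbb{C}$ is holomorphic and omits two distinct
        values $a\neq b$, so $f$ maps into $\Sigma=\mathbb{C}\setminus\{a,b\}$.
  \item $\Sigma$ is a sphere with three punctures, so by Table 2.1 its universal
        cover is $\pi:\mathbb{H}\to\Sigma$.
  \item $\mathbb{C}$ is simply-connected, so $f$ lifts to a holomorphic map
        $\tilde f:\mathbb{C}\to\mathbb{H}$ with $\pi\circ\tilde f=f$.
  \item The Cayley map $c(w)=\dfrac{w-i}{w+i}$ sends $\mathbb{H}$ conformally onto
        the unit disc, so $c\circ\tilde f$ is a bounded holomorphic function on
        $\mathbb{C}$.
  \item By Liouville's Theorem $c\circ\tilde f$ is constant; hence so are
        $\tilde f$ and $f=\pi\circ\tilde f$.
\end{enumerate}
```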


2.1.3 Functions and differential forms 


The last subsection gives several equivalent notions of a Riemann surface. Here we
see that any compact Riemann surface is the locus of a homogeneous polynomial
f(a, b, c) = 0 in the complex projective plane P²(C).

We study a manifold through the functions living on it. Two manifolds differing merely 
by a single point can have a completely different family of functions. For instance, we all 
know many examples of holomorphic functions on C. But the only functions holomorphic 
on C and also holomorphic at ∞ are the constants. More generally, any noncompact
Riemann surface Σ has several functions f : Σ → C holomorphic everywhere, while
if Σ is compact, the only holomorphic functions f : Σ → C are the constants. We are
more interested in compact Σ.

Given any Riemann surface Σ, let K(Σ) denote all the meromorphic functions f :
Σ → C or, equivalently, all holomorphic functions f : Σ → P¹(C) (by convention we
discard the constant function f = ∞). Let U_α ⊂ Σ, φ_α : U_α → V_α ⊂ C be a family of
coordinate charts for Σ. Then f ∈ K(Σ) iff each f ∘ φ_α⁻¹ is a meromorphic function of
the local coordinate z ∈ V_α.

For example, K(P¹(C)) consists of all rational functions f(z) = p(z)/q(z), with p and q
polynomials, while K(C) is much larger. This space K(Σ) is in fact always a field; its
algebraic structure determines the surface Σ (up to conformal equivalence) and naturally
mirrors all aspects of Σ. A compact Riemann surface Σ has genus 0 iff K(Σ) ≅ C(z), the
field of rational functions in some variable z. For positive genus, two generators are needed.


Theorem 2.1.5 Let Σ be a compact Riemann surface of genus g > 0. Choose any
nonconstant function f ∈ K(Σ). Then there exists another nonconstant function g ∈
K(Σ), such that K(Σ) = C(f)[g], i.e. for some n ∈ N, any h ∈ K(Σ) can be written in
the form

    h = \sum_{i=0}^{n-1} a_i(f)\, g^i,

where a_i(z) are rational. Moreover, there is an irreducible
polynomial P(z, w) such that P(f, g) = 0, and such that K(Σ) is isomorphic as a field
to the quotient C(z, w)/(P(z, w)) of the algebra of rational functions in z, w by the
ideal generated by polynomial P. Moreover, writing P as a homogeneous polynomial in
three variables, Σ is conformally equivalent to the complex curve P = 0 in the complex
projective plane P²(C).


For a proof and more material on Riemann surfaces, see [180]. It is nontrivial that 
we can embed any Riemann surface into the complex projective plane. In fact, most
complex n-tori Cⁿ/L (where L ⊂ Cⁿ is a 2n-dimensional lattice), for n > 1, cannot be
embedded in any projective space (Section 6.3.2). The plane curve P = 0 will typically
have ‘singularities’, that is points where all three partial derivatives vanish, where the
curve self-intersects transversely. These singularities can be ‘blown up’, that is the two
intersecting ‘complex strands’ (i.e. open discs in C) can be separated, but this requires
the complex curve to be embedded in P³, not P².

Every geometric feature (except the choice of orientation) of the surface Σ has an
algebraic analogue in K(Σ), and hence the geometry of Σ can be studied via algebra. For
example, a C-algebra homomorphism F : K(Σ′) → K(Σ) lifts to a holomorphic map F :
Σ → Σ′. This general observation is the starting point of both algebraic geometry and
noncommutative geometry. For example, the space of smooth complex-valued functions 
on a manifold M will be an infinite-dimensional commutative algebra, since the target 
C is a commutative algebra. Connes suggests that we study a noncommutative algebra 
as if it too is the algebra of functions on some manifold. The hope is that this should be 
directly relevant to quantum theories, since we access space-time only indirectly, via the 
functions (‘quantum fields’) living on it. We seem to get into problems in quantum field 
theory when we take too literally the (naive and improbable) intuition that space-time 
is anything like a manifold. In any case calculus in noncommutative geometry formally 
resembles quantum mechanics (e.g. the role of coordinates is played by self-adjoint 
operators — observables — and infinitesimal distance ds by the fermion propagator). 

For a concrete example of Theorem 2.1.5, consider the torus T_τ = C/(Z + τZ). A
meromorphic function f : T_τ → C lifts to a meromorphic function (which we also
call f) on C, with periods 1 and τ. That is, f ∈ K(T_τ) iff f : C → C is meromorphic
and f(z + m + nτ) = f(z) ∀z ∈ C, ∀m, n ∈ Z. Any such doubly-periodic meromorphic
function is called an elliptic function, for fairly obscure reasons.⁴ We know


4 One of the more carefree creative outlets for mathematicians is through their happy role as nomenclators. 
Elliptic functions first arose historically as the functional inverse of a certain class of integrals called 
‘elliptic integrals’. This class got its name since it included the integral computing arc-lengths of ellipses. 
Likewise, the name ‘elliptic curve’ for a genus-1 complex curve arose since the functions living on it are 
those elliptic functions. There is however no direct relation between ellipses and elliptic curves.

any nonconstant f ∈ K(T_τ) must have at least one pole in the ‘fundamental
parallelogram’ P_τ with corners at 0, τ, 1, 1 + τ. Moreover, the contour integral ∮_C f about
the parallelogram C = ∂P_τ vanishes by periodicity, so the sum of residues of f inside
P_τ must vanish. Hence any nonconstant elliptic function must have at least two poles
in P_τ.

We can construct an elliptic function by averaging f(z) = Σ_{m,n} g(z + m + nτ) for
any function g over each orbit z + Z + τZ. As the simplest possibility for a nonconstant
elliptic function would have a single pole of order 2 at the lattice points, it is
tempting to take g(z) = z⁻². Unfortunately, for large m, n, (z + m + nτ)⁻² is close to
(m + nτ)⁻², and so its sum over all m, n won’t converge. Thus we are led to consider its
‘regularisation’

    \wp(z) = z^{-2} + {\sum_{m,n=-\infty}^{\infty}}' \bigl\{ (z+m+n\tau)^{-2} - (m+n\tau)^{-2} \bigr\},    (2.1.6a)

called the Weierstrass function (although Eisenstein knew of it years earlier),
where Σ′ here means to avoid m = n = 0. Its derivative

    \wp'(z) = -2 \sum_{m,n=-\infty}^{\infty} (z+m+n\tau)^{-3}    (2.1.6b)


is also elliptic. Being meromorphic functions on a compact Riemann surface, ℘ and ℘′
must be polynomially related: we find

    \wp'(z)^2 = 4\,(\wp(z) - e_1)(\wp(z) - e_2)(\wp(z) - e_3),    (2.1.6c)

where e₁ = ℘(1/2), e₂ = ℘(τ/2) and e₃ = ℘((1 + τ)/2). This is shown by verifying that
(℘ − e₁)(℘ − e₂)(℘ − e₃)/℘′² has no poles and hence must be constant. Together, ℘ and
℘′ generate K(T_τ): we can write any elliptic function f ∈ K(T_τ) as R₁(℘) + ℘′ R₂(℘),
where R₁(℘(z)) is the even part (f(z) + f(−z))/2 of f and ℘′(z) R₂(℘(z)) the odd part.
T_τ is conformally equivalent to the projective curve with ‘finite’ points (℘(z), ℘′(z), 1) ∈
P²(C), together with the ‘infinite’ point (0, 1, 0) corresponding to the pole of ℘ and ℘′ at
z = 0.
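The relation (2.1.6c) can be tested numerically by truncating the lattice sums (2.1.6a)
and (2.1.6b). What follows is a rough sketch rather than a serious implementation: the
function names, the cut-off N and the sample values of z and τ are all our own choices,
and the symmetric truncation |m|, |n| ≤ N gives only a few digits of accuracy.

```python
def lattice_points(N):
    """All (m, n) with |m|, |n| <= N: a symmetric truncation of Z x Z."""
    return [(m, n) for m in range(-N, N + 1) for n in range(-N, N + 1)]

def wp(z, tau, N=60):
    """Truncated version of the regularised sum (2.1.6a)."""
    total = 1 / z**2
    for m, n in lattice_points(N):
        if (m, n) != (0, 0):          # the primed sum omits m = n = 0
            w = m + n * tau
            total += 1 / (z + w)**2 - 1 / w**2
    return total

def wp_prime(z, tau, N=60):
    """Truncated version of (2.1.6b); here the (0, 0) term is included."""
    return -2 * sum(1 / (z + m + n * tau)**3 for m, n in lattice_points(N))

def relative_defect(z, tau, N=60):
    """Relative size of wp'^2 - 4(wp - e1)(wp - e2)(wp - e3), cf. (2.1.6c)."""
    e1, e2, e3 = (wp(h, tau, N) for h in (0.5, tau / 2, (1 + tau) / 2))
    p = wp(z, tau, N)
    lhs = wp_prime(z, tau, N)**2
    rhs = 4 * (p - e1) * (p - e2) * (p - e3)
    return abs(lhs - rhs) / abs(lhs)

# Sample point z and modulus tau (arbitrary choices for illustration).
print(relative_defect(0.3 + 0.2j, 1j))
```

The printed defect should be small compared with 1 and should shrink as N grows; it does
not vanish exactly because the sums are truncated.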
One way to embed Riemann surfaces into projective space uses theta functions:

    \theta_{r,s}(\tau, z) = \sum_{m \in \mathbb{Z}} \exp\bigl[ \pi i \tau (m+r)^2 + 2\pi i (m+r)(z+s) \bigr],    (2.1.7a)

for any r, s ∈ Q. These functions and their generalisations are central to Moonshine, but
for now note that they converge for all (τ, z) ∈ H × C to a function holomorphic in both
τ and z. These θ_{r,s} are nearly doubly-periodic in z: if r, s ∈ (1/N)Z then

    \theta_{r,s}(\tau, z + Nm + \tau Nn) = \exp\bigl[ -\pi i N^2 n^2 \tau - 2\pi i N n z \bigr]\, \theta_{r,s}(\tau, z),    (2.1.7b)

for all m, n ∈ Z. Apart from a constant root of unity, θ_{r,s} depends only on the values of r
and s mod 1. Enumerate the N² pairs (r_i, s_i) ∈ (1/N)Z_N × (1/N)Z_N. Then for any N and any
τ ∈ H, the map from T_τ to P^{N²−1}(C) defined in homogeneous coordinates by

    z \mapsto \bigl( \theta_{r_1,s_1}(\tau, Nz),\ \theta_{r_2,s_2}(\tau, Nz),\ \ldots \bigr) \in \mathbb{P}^{N^2-1}(\mathbb{C})

is well defined (to see that this N²-tuple can never be the 0-vector, find explicitly the zeros
of θ_{r,s}). This map is one-to-one, that is it embeds the torus T_τ as a complex submanifold
of P^{N²−1}(C). We can specify this submanifold more explicitly (in the simplest case,
namely N = 2) by the homogeneous polynomials


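    \theta_{0,0}(\tau)^2 z_1^2 = \theta_{0,1/2}(\tau)^2 z_2^2 + \theta_{1/2,0}(\tau)^2 z_3^2, \qquad \theta_{0,0}(\tau)^2 z_4^2 = \theta_{1/2,0}(\tau)^2 z_2^2 - \theta_{0,1/2}(\tau)^2 z_3^2,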


where (z₁, z₂, z₃, z₄) ∈ P³(C) are homogeneous coordinates and θ_{r,s}(τ) = θ_{r,s}(τ, 0). The
fact that the image of T_τ satisfies those equations follows from the Riemann theta
identities. Moreover, any elliptic function f : T_τ → C can be written in the form


    f(z) = c \prod_i \frac{\theta_{0,0}(\tau, z - a_i)}{\theta_{0,0}(\tau, z - b_i)},

for arbitrary complex numbers a_i, b_i, c subject to the relation Σ_i a_i = Σ_i b_i. The
Weierstrass ℘-function can be written


    \wp(z) = -\frac{d^2}{dz^2} \log \theta_{1/2,1/2}(\tau, z) + c(\tau),

where the term c(τ) does not depend on z.
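The near-periodicity (2.1.7b) of the theta functions (2.1.7a) is easy to probe
numerically. The sketch below truncates the series at |m| ≤ M; the function names, the
cut-off M and the sample arguments are illustrative assumptions of ours.

```python
import cmath
import math

def theta(r, s, tau, z, M=30):
    """Truncated series (2.1.7a), summing over m = -M, ..., M."""
    return sum(
        cmath.exp(math.pi * 1j * tau * (m + r)**2
                  + 2j * math.pi * (m + r) * (z + s))
        for m in range(-M, M + 1)
    )

def quasi_period_defect(r, s, N, m, n, tau, z):
    """Relative difference between the two sides of (2.1.7b).

    Assumes r and s lie in (1/N)Z and that tau has positive imaginary part.
    """
    lhs = theta(r, s, tau, z + N * m + tau * N * n)
    factor = cmath.exp(-math.pi * 1j * N**2 * n**2 * tau
                       - 2j * math.pi * N * n * z)
    rhs = factor * theta(r, s, tau, z)
    return abs(lhs - rhs) / abs(lhs)

# Sample check with N = 2 and (r, s) = (1/2, 1/2).
print(quasi_period_defect(0.5, 0.5, 2, 1, 1, 0.2 + 1.1j, 0.3 - 0.1j))
```

Because the terms decay like exp(−π Im(τ) m²), a modest cut-off M already reproduces
(2.1.7b) to near machine precision for moderate shifts.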


For any k ∈ Z, a holomorphic (respectively meromorphic) k-form ω (Section 1.2.2)
on a complex curve Σ looks like f dz^k in local coordinates, where f is holomorphic
(respectively meromorphic). If we change local coordinates z₁ ↦ z₂ = φ₂(φ₁⁻¹(z₁)), then
(1.2.4b) becomes

    f_2(z_2) = f_1(z_1) \left( \frac{dz_1}{dz_2} \right)^{k},    (2.1.8)

where f_i denotes the coefficient of ω in the coordinate z_i.


For example, dz is a meromorphic (but not holomorphic) 1-differential on P¹(C) (it has a
pole of order 2 at ∞). Let H^k(Σ) be the vector space of holomorphic k-forms, and M^k(Σ)
be the space of meromorphic ones. Given any ω, ω′ ∈ M^k(Σ), ω′ not identically 0, the
ratio ω/ω′ lies in the function field K(Σ). Of course, as vector spaces M⁰(Σ) = K(Σ).
For any surface Σ and integer k, M^k(Σ) is infinite-dimensional, but for any compact
surface Σ and any integer k, the Riemann–Roch theorem implies that H^k(Σ) is always
finite-dimensional and may be 0.
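The claim about dz can be checked directly in the chart at infinity. In the short
computation below, the coordinate name w = 1/z is our own notation.

```latex
% Chart at infinity on P^1(C): w = 1/z (the name w is ours).
\begin{align*}
  dz &= d\bigl(w^{-1}\bigr) = -\,w^{-2}\,dw, \\
  f(z)\,dz &= -\,f(1/w)\,w^{-2}\,dw .
\end{align*}
% So dz has a pole of order 2 at w = 0, i.e. at z = \infty.
% If f(z) dz is holomorphic everywhere, then f is entire and
% f(1/w) w^{-2} is holomorphic at w = 0, forcing f(z) -> 0 as z -> \infty;
% by Liouville's Theorem f = 0. Thus H^1(P^1(C)) = 0, an instance of
% "finite-dimensional and may be 0".
```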


2.1.4 Moduli 


In physics, the phase space lets us consider all possible states of a physical system; the 
actual time-evolution of a given instance of that system will be a curve in phase space. 
Likewise, we often want to consider simultaneously families of manifolds, rather than 
fix a single manifold. For example, last subsection we treated all tori T_τ simultaneously.
The role of phase space is played by a moduli space, the space of orbits of a group of
diffeomorphisms of a geometric structure placed on a manifold. A path on the moduli
space connecting orbits [p] and [q] is a continuous deformation from the geometric
structure on p to that on q.

The notion of moduli space for surfaces is due to Riemann, who also computed
its dimension. The idea is to consider the space M(Σ₀) of all conformal equivalence
classes of Riemann surfaces homeomorphic to a given surface Σ₀. As Σ₀ is completely
characterised by the genus g and number n of punctures, we also denote this by M_{g,n}.
With a few exceptions mentioned shortly, M_{g,n} has complex dimension 3g − 3 + n.
However, these moduli spaces usually aren’t manifolds (they have conical singularities).
It was for this reason that Teichmüller introduced a cover, now called the Teichmüller
space T_{g,n}. The moduli space is recovered by the quotient M_{g,n} = T_{g,n}/Γ_{g,n}, where Γ_{g,n}
is a discrete group called the mapping class group (see Definition 2.1.6). Teichmüller
space is much better behaved than the moduli space: it is a complex manifold (except
for certain small g, n), and as a real manifold is diffeomorphic to R^{6g−6+2n}.

As we shall see, there’s a small number of pairs (g, n) that don’t behave com- 
pletely generically for one reason or another: namely, (0, 0), (0, 1), (0, 2), (0, 3), (0, 4), 
(1, 0), (1, 1) and (2,0). We mention some of their individual peculiarities below. 

In order to anticipate the definitions, consider a torus T (so g = 1, n = 0). For
concreteness (this doesn’t lose any generality), restrict to tori coming from a parallelogram
in the complex plane C, with one pair of opposite sides labelled ‘1’, and the other
pair labelled ‘2’; the torus is recovered by first identifying the opposite sides labelled
‘1’, and then identifying the opposite sides labelled ‘2’ (changing this order changes
the shape, though not the conformal class, of the torus). By translating, rotating and
rescaling this parallelogram, we can put the vertices at 0, 1, τ and τ + 1, for some τ ∈ H,
where the horizontal sides are labelled ‘1’, which continuously deforms the torus without
changing its conformal equivalence class. This is the best we can do, if we restrict to
continuous deformations. The resulting parameter space, namely the upper half-plane
H, is the Teichmüller space T_{1,0} for the torus. The torus corresponding to τ ∈ H is
T_τ = C/(Z + Zτ).

However, different points τ in H can correspond to conformally equivalent tori. For
example, we can cut the torus open along the seam ‘2’, twist the open arm m complete
turns, and then sew it back up. This amounts to replacing parameter τ with τ + m. As
long as m is an integer, this is a conformal diffeomorphism of the torus (if m isn’t an
integer, this map isn’t even continuous). Thus the points τ + Z all correspond to the same
conformal structure. Similarly, cutting open seam ‘1’ and giving the upper cap n complete
twists before resewing corresponds to replacing the parallelogram 0, 1, τ and τ + 1
with the parallelogram 0, 1 + nτ, τ and (n + 1)τ + 1; after putting it into canonical
form, this replaces τ with τ/(nτ + 1). Both these twists are called Dehn twists. We can
also switch the roles of sides ‘1’ and ‘2’, which replaces τ with −1/τ (why?). More
generally, the tori corresponding to parameters τ and (aτ + b)/(cτ + d) are conformally
equivalent, for any

    \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z}).

This accounts for all redundancies in the parametrisation
by H of the conformal equivalence classes of tori. The orbit space SL₂(Z)\H is the
‘moduli space’ M_{1,0} for the torus. Note that M_{1,0} has conical singularities at the orbits
[τ] = [i] and [e^{2πi/3}], corresponding to those tori with additional automorphisms. This
happens in higher genus too. Indeed, any finite group G is the automorphism group of
some surface of sufficiently high genus. For example, there will be a compact Riemann
surface with automorphism group exactly the Monster M, though it will have genus at
least 9.6 × 10⁵¹.
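The redundancy in the parameter τ can be made concrete. The sketch below assumes the
standard fundamental region |Re τ| ≤ 1/2, |τ| ≥ 1 for SL₂(Z) acting on H (not introduced
in the text above) and brings τ into it using only τ ↦ τ + m and τ ↦ −1/τ; the function
names are our own. Two parameters related by an SL₂(Z) matrix should then reduce to the
same point, at least away from the boundary of the region.

```python
def mobius(a, b, c, d, tau):
    """Action of the matrix with rows (a, b) and (c, d) on tau."""
    return (a * tau + b) / (c * tau + d)

def reduce_to_fundamental_region(tau, max_steps=1000):
    """Move tau (with Im tau > 0) into |Re tau| <= 1/2, |tau| >= 1.

    Uses only the moves tau -> tau + m (m an integer) and tau -> -1/tau.
    """
    for _ in range(max_steps):
        tau = complex(tau.real - round(tau.real), tau.imag)  # translate
        if abs(tau) >= 1:
            return tau
        tau = -1 / tau                                       # invert
    raise RuntimeError("reduction did not terminate")

tau = 0.3 + 1.7j                      # already in the fundamental region
image = mobius(2, 1, 1, 1, tau)       # matrix of determinant 2*1 - 1*1 = 1
print(image, reduce_to_fundamental_region(image))
```

For the sample matrix the image lies much closer to the real axis than τ, yet the
reduction should return to (approximately) the original τ, reflecting that both parameters
describe the same torus.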


Definition 2.1.6 Let Σ₀ be a fixed Riemann surface. Consider all pairs (Σ, f), where f
is an orientation-preserving homeomorphic map of Σ₀ onto Σ. Write (Σ, f) ∼ (Σ′, f′)
if there exists a conformal homeomorphism h : Σ → Σ′ such that the homeomorphism
f′⁻¹ ∘ h ∘ f : Σ₀ → Σ₀ is homotopic to the identity. The set of these equivalence
classes is the Teichmüller space T(Σ₀). The mapping class group Γ(Σ₀) is the quotient
Homeo₊(Σ₀)/Homeo₀(Σ₀) of the group of orientation-preserving self-homeomorphisms
f of Σ₀, by the (normal) subgroup consisting of those homotopic to the identity.


For example, Γ_{1,0} = SL₂(Z) and T_{1,0} = H; as we explain in Section 2.2.4, the moduli
space M_{1,0} is a punctured sphere. Because C/(Z + τZ) can also be interpreted as a
torus with a special point, namely the additive identity 0, we also have T_{1,1} = H and
Γ_{1,1} = SL₂(Z). For a different reason, we also have T_{0,4} = H and Γ_{0,4} = SL₂(Z).

The basic idea, illustrated above, is that the Teichmüller space T_{g,n} accounts for
‘continuous’ conformal equivalences, while the mapping class group Γ_{g,n} contains the
left-over ‘discontinuous’ ones. To help make this important but abstract definition more
accessible, consider the following artificial example. Let X = R², and suppose the
additive group G = Z × R acts on X by addition. Then G is a disconnected Lie group
with connected components G_n := {n} × R for each n ∈ Z; the component G₀ is the
one containing the identity (0, 0). The group π₀ = G/G₀ ≅ Z interchanges the
components in the obvious way. We can mod out first by the continuous part G₀ of G (which
should be relatively easy), then by the discontinuous π₀: the orbit space X/G is then
(X/G₀)/π₀ = R/Z = S¹. Of course, here X plays the role of the infinite-dimensional
space of all conformal structures, G plays the role of all conformal homeomorphisms,
and X/G is the moduli space. The identity component G₀ corresponds to the
homeomorphisms homotopic to the identity, π₀ is the mapping class group and X/G₀ is the
Teichmüller space.

The mapping class groups are central to our story, so we’ll try to make them more
accessible. More details and proofs are provided in [56], [270], [60] and chapter 4 of
[59]. A simple presentation of the mapping class group Γ_{g,n} for n = 0, 1 (the cases of
greatest interest to us) is given in [550].

Γ_{g,n} acts like a braid group. For example, any f ∈ Homeo₊(Σ) permutes the n punctures,
so the same is true of γ ∈ Γ_{g,n}; the ‘pure’ mapping class group PΓ_{g,n} consists of
those γ ∈ Γ_{g,n} that fix each puncture. Then PΓ_{g,n} is normal in Γ_{g,n} and has quotient
Γ_{g,n}/PΓ_{g,n} ≅ S_n.

A braid group B_n(Σ) can be associated with any surface Σ in the obvious way [59].
For genus g ≥ 2 and any n ≥ 0, the group Γ_{g,n} is an extension of B_n(Σ_g), by the group
Γ_{g,0}. For genus g = 1 and n ≥ 2, Γ_{1,n} is an extension of the quotient B_n(Σ₁)/Z(B_n(Σ₁))
by PSL₂(Z), where the centre Z(B_n(Σ₁)) ≅ Z². For genus g = 0 and n ≥ 3, the group
Γ_{0,n} is isomorphic to the quotient B_n(S²)/Z(B_n(S²)), where Z(B_n(S²)) ≅ Z₂. For any
n, Γ_{0,n} is a homomorphic image (i.e. a quotient) of the braid group B_n.

Let Σ be a compact Riemann surface. To any simple closed loop γ in Σ, we can
define the Dehn twist about γ, by cutting out from Σ a neighbourhood of the loop
homeomorphic to a cylinder, giving one end of this cylinder an integral twist, and gluing
it back. The Dehn twists about the 2g elementary loops a_i, b_j defined in Section 2.1.2
generate the mapping class group of Σ.

Teichmüller space need not be connected. In particular, there are three different kinds
of twice-punctured spheres: one is flat and has conformal structure given by the cylinder
C/Z; one is the punctured disc 0 < |z| < 1 and corresponds to the half-cylinder
H/⟨z ↦ z + 1⟩; and finally, we have the family of annuli A_r := {r < |z| < 1}, which are
all of the form H/⟨z ↦ λz⟩ for λ > 1. Thus T_{0,2} and M_{0,2} consist of two isolated points
and an open line segment (0, 1) say. Γ_{0,2} ≅ Z₂ consists of the identity, and the inversion
through 0 that exchanges the two boundary circles. Similarly, both T_{0,1} and M_{0,1} consist
of two isolated points.

The mapping class group usually (but not always) acts faithfully on Teichmüller space
(a faithful action means that the only group element that acts trivially is the identity
element). Γ_{1,0} = Γ_{1,1} = Γ_{0,4} are exceptions: −I ∈ SL₂(Z) acts trivially on H. Also,
consider the thrice-punctured sphere P¹(C) \ {z₁, z₂, z₃}. As is well known, Aut(S²) ≅
PSL₂(C) can conformally move any three points to any other three points, so we can send
z₁, z₂, z₃ ∈ P¹(C) respectively to 0, 1, ∞. Thus T_{0,3} consists of a single point. However,
we could have moved, for example, z₂, z₁, z₃ instead to 0, 1, ∞, respectively. A total of
six different choices could have been made, corresponding to the mapping class group
Γ_{0,3} ≅ S₃, which acts trivially on Teichmüller space.

M_{g,n} is simultaneously the moduli space of: (i) conformal equivalence classes of real
surfaces; (ii) complete Riemannian metrics of constant negative curvature on real
surfaces; and (iii) complex-analytic structures on complex curves. This is an accident of
small dimensions; for example, the Mostow Rigidity Theorem says that in three
dimensions the moduli space of (ii) consists of a single point.

A different approach to moduli spaces ties in with Sections 2.3.5 and 6.3.2. First,
by the Siegel upper half-space H_g we mean the space of all symmetric g × g complex
matrices Ω whose imaginary part Im(Ω) is positive-definite, that is, vᵗ Im(Ω) v > 0 for
any nonzero column vector v ∈ R^g. H_g is a higher-genus generalisation of H. The role of
the group SL₂(Z) here is played by the symplectic group Sp_{2g}(Z), that is the group of all
determinant 1 2g × 2g integer matrices M satisfying

    M^t \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix} M = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix},

where I = I_g and 0 are, respectively, the g × g identity and g × g zero matrices. The familiar
action γ.τ = (aτ + b)/(cτ + d) is replaced by the action

    \begin{pmatrix} A & B \\ C & D \end{pmatrix} . \Omega = (A\Omega + B)(C\Omega + D)^{-1}, \qquad \forall \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathrm{Sp}_{2g}(\mathbb{Z}),\ \forall\, \Omega \in \mathbb{H}_g.    (2.1.9a)

The generalisation of the Jacobi theta function (2.1.7a) is Siegel’s theta function

    \Theta(\Omega, z) = \sum_{n \in \mathbb{Z}^g} \exp\bigl( \pi i\, n^t \Omega n + 2\pi i\, n \cdot z \bigr),    (2.1.9b)

which converges for all Ω ∈ H_g and z ∈ C^g.

Where does H_g come from? Associate with a compact genus-g surface Σ_g its Jacobian
variety, as follows. The space H¹(Σ_g) of holomorphic 1-forms is g-dimensional, so let
{ω₁, …, ω_g} be a basis. Fix any base-point p ∈ Σ_g; then we get a map from Σ_g × ⋯ ×
Σ_g to C^g by integrating:

    (q_1, \ldots, q_g) \mapsto \Bigl( \sum_{i=1}^{g} \int_{C_i} \omega_1,\ \ldots,\ \sum_{i=1}^{g} \int_{C_i} \omega_g \Bigr),

where C_i is any path on Σ_g from p to q_i. Of course the result depends on which paths
C_i are chosen, and so isn’t well defined as a function of the q_i’s alone. However, consider
the set L of all possible values (∮_C ω₁, …, ∮_C ω_g) ∈ C^g, where C runs over all possible
closed loops in Σ_g passing through p. Then our ill-defined map Σ_g × ⋯ × Σ_g → C^g
will become well-defined (i.e. independent of the choice of path C_i) if we replace the
target C^g with C^g/L. It isn’t hard to show that L is a 2g-dimensional lattice (in fact a
basis is given by the values on the 2g loops we call a_i, b_i in (2.1.5a)), and so C^g/L
is a 2g-dimensional torus, called the Jacobian variety Jac(Σ_g). This map Σ_g × ⋯ ×
Σ_g → C^g/L is holomorphic and surjective (‘Jacobi Inversion’). Restricting it to the
diagonal embedding q ↦ (q, …, q) ∈ Σ_g × ⋯ × Σ_g, we get a one-to-one conformal
embedding of Σ_g into Jac(Σ_g), sending q to the image of (q, …, q). When g = 1, Σ₁ and
Jac(Σ₁) are identical; when g > 1 the embedding is into a proper submanifold of the
Jacobian (check dimensions).

Now, we can select our basis ω_i of 1-forms so that the integral ∮_{a_j} ω_i equals δ_{ij}.
This choice means that our lattice L contains Z^g. The remaining basis vectors of L
are (∮_{b_j} ω₁, …, ∮_{b_j} ω_g) ∈ C^g, and it can be shown (the ‘Riemann bilinear relations’)
that these basis vectors will be column vectors of a symmetric g × g matrix Ω whose
imaginary part is positive-definite, that is, the period matrix Ω lies in H_g. So the lattice L
becomes Z^g + ΩZ^g and the Jacobian becomes T_Ω := C^g/(Z^g + ΩZ^g), where we regard
vectors in Z^g and C^g as column vectors. Different choices of bases correspond to the
Sp_{2g}(Z)-orbit of Ω.

So every surface Σ_g corresponds to an Sp_{2g}(Z)-orbit in H_g. The Schottky Problem asks
which points in H_g arise as period matrices. Call this subset ℭ_g. Our moduli space M_{g,0}
can be identified with ℭ_g/Sp_{2g}(Z) and Sp_{2g}(Z) is a homomorphic image (or quotient)
of Γ_{g,0}. Since the symplectic group Sp_{2g}(Z) is much more accessible than the mapping
class group Γ_{g,0}, the main difficulty is to find a nice characterisation of ℭ_g and the kernel
of Γ_{g,0} → Sp_{2g}(Z). For a formal solution to the Schottky problem, see e.g. [12].

Fig. 2.7 The surface w² = (z − 2)(z + 1 − α)(z + 1 + α), panels (a)–(d).


The moduli space M_{g,n} is rarely compact. A very natural way to compactify M_{g,n}, due
to Deligne and Mumford, is fundamental to conformal field theory. Consider first the
complex curve w² = (z − 2)(z + 1 − α)(z + 1 + α), where α is a parameter. Provided
α ≠ 0, ±3, this is a genus-1 nonsingular curve, conformally equivalent to the torus


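C/(Z + τZ) where

    j(\tau) = 1728\, \frac{4(\alpha^2+3)^3}{4(\alpha^2+3)^3 - 27(2\alpha^2-2)^2} = 1728\, \frac{4(\alpha^2+3)^3}{(2+1-\alpha)^2 (2+1+\alpha)^2 (2\alpha)^2}.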

We know that M_{1,0} is real-diffeomorphic to a sphere with one point removed. As we vary
α, we move through M_{1,0}, and as α → 0 we approach the missing point. What happens
to the curve in that limit? In Figure 2.7(a)–(c) we intersect our curve, for α = 1/2, 1/4, 0,
respectively, with the plane R² ⊂ C². Figure 2.7(d) gives a picture of the complex curve
at α = 0: it is a pinched torus. We call the nonsmooth point (z, w) = (−1, 0) a node. This
is the surface to which the boundary point of M_{1,0} corresponds. Including it compactifies
M_{1,0} to M̄_{1,0} = S².

More generally, we add to each moduli space M_{g,n} the surfaces Σ with nodes. These
are connected compact spaces where the neighbourhood of any point either looks like
C (i.e. Σ is smooth there) or like zw = 0 at (0, 0) (these are the nodes). We say Σ has
type (g, n) if unpinching each node results in a genus-g surface with n punctures; for
example, Figure 2.7(d) has type (1, 0). We require these surfaces to have the following
property: when you delete all nodes and the surface falls into connected pieces, none of
those pieces is a sphere with one or two punctures (the only exception is that we also
allow a torus with one node). These surfaces are called stable, because they have a finite
automorphism group (this terminology is explained by visualising a marble versus a dice
on a tabletop). As we know, the larger the automorphism group, the worse the singularity
is in moduli space.

The moduli space M_{g,n} is compactified if we include the conformal equivalence
classes of stable type (g, n) surfaces with nodes. The resulting space M̄_{g,n} is called the
moduli space of stable surfaces. A nice review is given in [447]. For example, the moduli
space M_{0,4} is also a sphere with one missing point. That missing point corresponds to
pinching a sphere with four punctures into two spheres, each with two punctures.

Moduli spaces of curves seem first to have been introduced into string theory and
conformal field theory by Polyakov in 1981, and have played an important role there ever
since. We are actually more interested in an enhanced moduli space, obtained by
decorating Riemann surfaces with additional structure. Many more or less equivalent alternatives
have appeared in the literature. In particular, let Σ be a compact genus-g surface,
possibly with nodes, with n marked points p_i ∈ Σ (none of which are at a node). About each
point p_i is chosen a local coordinate z_i, vanishing at p_i, identifying a neighbourhood
of p_i with a neighbourhood of 0 in C (see section 2.1 of [530] for details). We call
this data (Σ, {p_i}, {z_i}) an enhanced surface of type (g, n). It is essentially equivalent
to removing a disc from Σ about p_i and choosing a parametrisation about the boundary
circle. We call two enhanced surfaces (Σ, {p_i}, {z_i}) and (Σ′, {p′_i}, {z′_i}) equivalent if
there is a conformal equivalence h : Σ → Σ′ such that h(p_i) = p′_i and z′_i(h(x)) = z_i(x)
locally about p_i. The resulting moduli space M̂_{g,n} will be infinite-dimensional, but the
mapping class group Γ̂_{g,n} will be an extension of the usual Γ_{g,n} by Zⁿ.


Fig. 2.8 The Dehn twists on the torus with one marked point.

These groups Γ̂_{g,n} are of great interest to us; for example, a rational conformal field
theory gives a projective finite-dimensional representation of each of them. This yields
the braid group representations in quantum groups or Jones subfactor theory, as well as
the modularity of Moonshine. They are discussed, with many examples, in section 5.1
of [32] (where they are denoted F, „n, and what we call I’,,, is denoted there Ty) For 
example, Γ̂_{1,1} is the braid group B₃, a central extension of SL₂(Z) by Z. It is generated
by the Dehn twists depicted in Figure 2.8. We return to this in Sections 4.3.3, 5.3.4
and 7.2.4.

The main reason we prefer extended surfaces to ordinary Riemann surfaces is that 
there are canonical ways to sew them together. This sewing operation is fundamental in 
conformal field theory, because it permits us to decompose a higher-genus surface into 
discs and ‘pairs-of-pants’ (Section 4.4.1). 


Question 2.1.1. How would a hyperbolic mathematician model the Euclidean plane? 


Question 2.1.2. (a) Let $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{R})$, $\gamma \neq \pm I$. We can regard γ as a map
from the extended upper-half plane H ∪ R ∪ {∞} to itself. Show that:

(i) |a + d| = 2 iff γ has exactly one fixed point on the boundary R ∪ {∞}, iff γ can
be conjugated in SL2(R) to the translation z ↦ z + t;
(ii) |a + d| > 2 iff γ has exactly two distinct fixed points on the boundary R ∪ {∞},
iff γ can be conjugated in SL2(R) to the dilation z ↦ λz;
(iii) |a + d| < 2 iff γ has exactly one fixed point in H, iff γ can be conjugated in
SL2(R) to the rotation $z \mapsto \frac{\cos(\theta) z + \sin(\theta)}{-\sin(\theta) z + \cos(\theta)}$ about i with fixed point i.


(b) Suppose Γ is a Fuchsian group. Prove that γ ∈ Γ has a fixed point in H iff γ has
finite order.


Question 2.1.3. Explain how the addition of points on a conic is a degenerate case of the 
addition of points on a cubic. 

Question 2.1.4. Find all rational solutions (r, s) to r? − 2rs + r + 2s − s? = 0. Verify
that, for the choice of identity e = (0, 0) and addition defined as in Figure 2.6, the rational 
points form a subgroup. As an abstract group, what is this subgroup isomorphic to? 


Question 2.1.5. Using the conformal map $z \mapsto (x, y) = (\wp(z), \wp'(z))$ between C/(Z +
τZ) and the cubic $y^2 = 4(x - e_1)(x - e_2)(x - e_3)$, verify that the addition of points on
the cubic corresponds to the addition $z_1 + z_2$ (mod Z + τZ) in C.

Question 2.1.6. Identify $\mathfrak{M}_{0,4}$ with a space of $S_3$-orbits in C \ {0, 1}.


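Question 2.1.7. Let G be a finite group. Define

$$K(g, h) = \frac{1}{|G|} \sum_{\rho} \dim(\rho)\, \mathrm{ch}_\rho(g h^{-1}),$$

for g, h ∈ G, where the sum is over all irreducible representations ρ of G.
(a) Verify that $K(g, h) = \delta_{g,h}$.
(b) For any γ ∈ N, take $f : G^{2\gamma} \to G$ by

$$f(g_1, h_1, \ldots, g_\gamma, h_\gamma) = g_1 h_1 g_1^{-1} h_1^{-1}\, g_2 h_2 g_2^{-1} h_2^{-1} \cdots g_\gamma h_\gamma g_\gamma^{-1} h_\gamma^{-1}.$$

Define $J = \sum_{g_i, h_i \in G} K(f(g_i, h_i), e)$. By evaluating J in two ways, obtain the formula

$$|\mathrm{Hom}(\pi_1(\Sigma_\gamma), G)| = |G|^{2\gamma - 1} \sum_{\rho} \dim(\rho)^{2 - 2\gamma},$$

where $\Sigma_\gamma$ is a compact genus-γ surface.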
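
As a sanity check on the counting formula in Question 2.1.7 (not a proof), the following sketch brute-forces the left side for the symmetric group S3 and compares it with the right side. The choice G = S3, the helper names and the hard-coded irreducible dimensions 1, 1, 2 are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from itertools import permutations, product


def compose(p, q):
    """Composition of permutations: (p∘q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def commutator(g, h):
    """Return g h g^{-1} h^{-1}."""
    return compose(compose(g, h), compose(inverse(g), inverse(h)))


def count_homs(group, genus):
    """Count tuples (g_1, h_1, ..., g_genus, h_genus) whose commutator product is e."""
    identity = tuple(range(len(group[0])))
    count = 0
    for elems in product(group, repeat=2 * genus):
        word = identity
        for g, h in zip(elems[0::2], elems[1::2]):
            word = compose(word, commutator(g, h))
        if word == identity:
            count += 1
    return count


def character_formula(order, dims, genus):
    """Right side: |G|^(2*genus - 1) * sum over irreducibles of dim^(2 - 2*genus)."""
    return order ** (2 * genus - 1) * sum(Fraction(d) ** (2 - 2 * genus) for d in dims)


S3 = list(permutations(range(3)))
S3_DIMS = [1, 1, 2]  # assumed known: dimensions of the irreducible representations of S3
```

For genus 1 this counts commuting pairs; for genus 2 the brute force runs over 6^4 tuples, which is still instantaneous.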


2.2 Modular forms and functions 


Number theory, at its most elemental level, is concerned with finding integer solutions 
to various (systems of) equations. It is truly remarkable how this seemingly pedestrian 
pursuit has resulted in the creation of the richest and deepest mathematics. Indeed, it 
is tempting to suspect that beneath any spot on the mathematical turf, no matter how 
remote or seemingly barren, is a gemstone merely requiring hard work and discerning 
fingertips to unearth. 


2.2.1 Definition and motivation 


As we saw in several different contexts in Section 2.1, the group SL2(R) of 2 × 2 matrices
with real entries and determinant 1 acts on the upper half-plane H = {τ ∈ C | Im(τ) > 0}
by Möbius transformations (2.1.4a). For example, the matrices $s := \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and
$t := \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ correspond to the functions τ ↦ −1/τ and τ ↦ τ + 1, respectively.


Consider Γ = SL2(Z), the subgroup of SL2(R) consisting of the matrices with integer
entries. It is generated by s and t:

$$\mathrm{SL}_2(\mathbb{Z}) = \left\langle \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \right\rangle = \langle s, t \mid s^4 = e,\ s^2 = (st)^3 \rangle. \qquad (2.2.1a)$$

Because −I ∈ SL2(Z) yields the trivial map in H, we are also interested in the group

$$\mathrm{PSL}_2(\mathbb{Z}) = \mathrm{SL}_2(\mathbb{Z})/\{\pm I\} = \langle s, t \mid s^2 = (st)^3 = e \rangle \cong \mathbb{Z}_2 * \mathbb{Z}_3, \qquad (2.2.1b)$$

the free product of $\mathbb{Z}_2$ with $\mathbb{Z}_3$. Groups like Γ act on the extended upper half-plane
$\overline{\mathbb{H}} := \mathbb{H} \cup \{i\infty\} \cup \mathbb{Q}$ in the obvious way (e.g. s interchanges 0 and i∞). The extra points
{i∞} ∪ Q are called cusps because of the hyperbolic triangle R in Figure 2.3, which
points at one of them. Cusps correspond to tori with a single node (Figure 2.7(d)), and
compactify the moduli space $\mathfrak{M}_{1,0}$.
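
The relations between the generators s and t can be checked directly with integer matrices. The sketch below (helper names are illustrative) verifies that s² = (st)³ = −I in SL2(Z), so both become trivial in PSL2(Z), and that s and t act on the upper half-plane as τ ↦ −1/τ and τ ↦ τ + 1.

```python
def matmul(A, B):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def power(A, n):
    """A^n for a non-negative integer n."""
    result = ((1, 0), (0, 1))
    for _ in range(n):
        result = matmul(result, A)
    return result


def mobius(A, tau):
    """Action of A = ((a, b), (c, d)) on tau by (a*tau + b)/(c*tau + d)."""
    (a, b), (c, d) = A
    return (a * tau + b) / (c * tau + d)


s = ((0, -1), (1, 0))
t = ((1, 1), (0, 1))
IDENTITY = ((1, 0), (0, 1))
MINUS_IDENTITY = ((-1, 0), (0, -1))
```

Since −I acts trivially on H, the relations s² = −I and (st)³ = −I in SL2(Z) are exactly the statement that s has order 2 and st has order 3 as Möbius transformations.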

Recall Definition 0.1: a modular function for Γ is a meromorphic function f : H → C,
symmetric with respect to Γ. A related definition is:


Definition 2.2.1 A modular form f for Γ = SL2(Z) of weight k ∈ Q and multiplier
μ : Γ → C, |μ| = 1 is a holomorphic function f : H → C, which is also holomorphic
at the cusps Q ∪ {i∞} and obeys the transformation law

$$f\Big(\frac{a\tau + b}{c\tau + d}\Big) = \mu\begin{pmatrix} a & b \\ c & d \end{pmatrix} (c\tau + d)^k f(\tau), \qquad \forall \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma. \qquad (2.2.2)$$


For fractional k we choose the branch of the kth power to be the principal one (so $x^k > 0$
when x > 0). For number-theoretic purposes, we require the values of μ to be roots of
unity. Writing $\mu(t) = e^{2\pi i h}$, we can expand f in powers of q: $f(\tau) = q^h \sum_{n=-\infty}^{\infty} a_n q^n$.
By ‘meromorphic at i∞’ we mean that all but finitely many negative n have $a_n = 0$,
so f has a pole of finite order at q = 0; by ‘holomorphic at i∞’ we mean h ≥ 0 and
$a_n = 0$ for all negative n. Meromorphicity or holomorphicity at the other cusps is implied
by that at i∞, because of (2.2.2) and the fact that all cusps lie in the same SL2(Z)-orbit.

For the significance, which is considerable, of the condition that f be meromorphic at
the cusps, see Question 2.2.1. The moduli spaces $\mathfrak{M}_{1,0}$, $\mathfrak{M}_{1,1}$ and $\mathfrak{M}_{0,4}$ all are SL2(Z)\H.
The cusps Q ∪ {i∞} of $\overline{\mathbb{H}}$ correspond to pinched tori or spheres (Section 2.1.4).
Meromorphicity at the cusps says f respects this surface degeneration in the appropriate
way.

If the weight k is an integer, the multiplier μ will necessarily be a one-dimensional
representation of Γ; when k is rational, μ will be a projective representation. We define
projective representations, and explain what to do with them, in Section 3.1.1. An
intriguing implication for fractional k is described in Section 2.4.3.

The function f is called a modular form because $f(\tau)\, d\tau^{-k/2}$ is a holomorphic (−k/2)-form
on the space SL2(Z)\H; by contrast, a modular function f is a meromorphic function
on the space SL2(Z)\H.

The easiest examples of modular forms of weight k ≥ 4 (k even) are the Eisenstein
series $G_k$ defined in equation (0.1.5). It is conventional to normalise them as follows:

$$E_k(\tau) := \frac{1}{2\zeta(k)}\, G_k(\tau) = 1 - \frac{2k}{B_k} \sum_{n=1}^{\infty} \sigma_{k-1}(n)\, q^n \in \mathbb{Q}[[q]], \qquad (2.2.3a)$$

where $B_k$ are the Bernoulli numbers, defined by the generating function $\frac{x}{e^x - 1} = \sum_{k=0}^{\infty} B_k \frac{x^k}{k!}$,
and where $\sigma_{k-1}(n)$ and the Riemann zeta function ζ(s) are defined by

$$\sigma_k(n) = \sum_{d \mid n} d^k, \qquad (2.2.3b)$$

$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \prod_{p\ \mathrm{prime}} (1 - p^{-s})^{-1} \qquad (2.2.3c)$$

(see Section 2.3.1). The $E_k$ and $G_k$ have multiplier μ = 1.
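
To make the normalisation (2.2.3a) concrete, here is a small sketch that computes the first q-expansion coefficients of the normalised Eisenstein series from the Bernoulli numbers and the divisor sums. The function names are illustrative; the Bernoulli numbers use the convention x/(e^x − 1) = Σ B_k x^k/k!, so that B_1 = −1/2.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(m):
    """Return [B_0, ..., B_m] using sum_{j<=n} C(n+1, j) B_j = 0 for n >= 1."""
    B = [Fraction(1)]
    for n in range(1, m + 1):
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B


def sigma(power, n):
    """Divisor sum: sum of d**power over the positive divisors d of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)


def eisenstein(k, terms):
    """First `terms` coefficients of E_k = 1 - (2k/B_k) * sum sigma_{k-1}(n) q^n."""
    B = bernoulli_numbers(k)
    factor = -Fraction(2 * k) / B[k]
    return [Fraction(1)] + [factor * sigma(k - 1, n) for n in range(1, terms)]
```

For k = 4 and k = 6 the prefactor −2k/B_k is the integer 240 and −504 respectively; for k = 12 it is 65520/691, which is why the coefficients live in Q rather than Z in general.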

Indeed, the Eisenstein series generate all modular forms for SL2(Z) with trivial
multiplier μ. More specifically, the span of all such modular forms (over all k) is a ring graded
by k (i.e. the product of modular forms of weight k and k′ is one of weight k + k′). This
ring is generated (over C) by the Eisenstein series $E_4(\tau)$ and $E_6(\tau)$; that is, any weight
k modular form f can be written as a polynomial (homogeneous in the obvious sense)
in $E_4$ and $E_6$. Moreover, $E_4$ and $E_6$ are algebraically independent, so that polynomial
is unique. Using this we can readily compute the dimension of (and find a basis for) the
space of weight k modular forms. For instance, a basis for the weight 24 modular forms
is $\{E_4^6,\ E_4^3 E_6^2,\ E_6^4\}$.
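
The dimension count described above amounts to listing the monomials in E4 and E6 of total weight k, that is, the non-negative solutions of 4a + 6b = k. A minimal sketch (the function name is an illustrative choice):

```python
def monomial_basis(k):
    """Exponent pairs (a, b) with 4a + 6b = k; each stands for the form E4^a * E6^b."""
    return [
        (a, (k - 4 * a) // 6)
        for a in range(k // 4 + 1)
        if (k - 4 * a) % 6 == 0
    ]


def dimension(k):
    """Dimension of the space of weight-k modular forms for SL2(Z), trivial multiplier."""
    return len(monomial_basis(k))
```

For k = 24 this returns the exponent pairs (0, 4), (3, 2) and (6, 0), matching the basis quoted in the text. For even k ≥ 0 the count agrees with the familiar closed form: ⌊k/12⌋ when k ≡ 2 (mod 12), and ⌊k/12⌋ + 1 otherwise.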

The definition of modular forms seems fairly arbitrary. For example, one may ask how
significant the upper half-plane H is, or where the factor $(c\tau + d)^k$ in (2.2.2) comes from.
We confront this in Section 2.4.1. But for now note that Definition 2.2.1 (like Definition
0.1 before it) also makes perfect sense if SL2(Z) is replaced by any Fuchsian group Γ
that sends the cusps Q ∪ {i∞} to themselves. The only (minor) complication is that the
cusps may not lie in the same orbit. See, for example, [352] for the proper definition. We
are interested in Γ commensurable with SL2(Z), that is, Γ ∩ SL2(Z) has finite index in
both Γ and SL2(Z). Typical choices for Γ are the congruence subgroups

$$\Gamma(N) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z}) \;\Big|\; \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \pm \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \ (\mathrm{mod}\ N) \right\}, \qquad (2.2.4a)$$

$$\Gamma_0(N) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z}) \;\Big|\; c \equiv 0 \ (\mathrm{mod}\ N) \right\}, \qquad (2.2.4b)$$

for any N ∈ N. Incidentally, for N > 1, Γ(N)/{±I} is always free (i.e. isomorphic to
some free group $F_n$), while $\Gamma_0(N)$ may or may not be free.

It is not at all obvious that modular forms and functions should be interesting, but in
fact they are unavoidable in modern number theory. For example, consider the question
of writing numbers as sums of squares. We can write 5 = 1² + (−2)² = (−1)² + 1² +
0² + 1² + (−1)² + 1², to give a couple of trivial examples. Let $N_n(k)$ be the number of ways
we can write the integer n as a sum of k squares, counting order and signs. For example,
$N_5(1) = 0$ (since 5 is not a perfect square), $N_5(2) = 8$ (since 5 = (±1)² + (±2)² =
(±2)² + (±1)²), $N_5(3) = 24$, etc. Their generating functions are:

$$\sum_{n=0}^{\infty} N_n(k)\, x^n = \theta(x)^k,$$


5 A fundamental principle in mathematics is: whenever you have a subscript with an infinite range, make a 
power series (called a generating function) out of it. 
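where

$$\theta(x) = 1 + 2x + 2x^4 + \cdots = \sum_{n \in \mathbb{Z}} x^{n^2}$$

is called a theta function. It turns out that θ transforms nicely with respect to SL2(Z),
once we make the change-of-variables x = exp[πiτ] (what we usually call √q). Write
$\theta_3(\tau)$ for θ(x). Then $\theta_3$ is clearly invariant under the action of $\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$, and a little work
(done next subsection) shows that $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ takes $\theta_3(\tau)$ to $\sqrt{\tau/i}\; \theta_3(\tau)$. Together those
two modular transformations generate the group

$$\Gamma_\theta := \left\langle \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \right\rangle = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z}) \;\Big|\; ac \equiv bd \equiv 0 \ (\mathrm{mod}\ 2) \right\}. \qquad (2.2.5)$$

$\theta_3$ is a modular form of weight 1/2 and nontrivial multiplier for $\Gamma_\theta$.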
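
The generating-function statement can be tested numerically: raising the truncated theta series to the kth power and reading off the coefficient of x^n should reproduce the count of representations of n as an ordered, signed sum of k squares. A minimal sketch (the function name and truncation scheme are illustrative):

```python
def theta_power_coeffs(k, max_n):
    """Coefficients of theta(x)^k up to x^max_n, where theta(x) = sum over n in Z of x^(n^2)."""
    theta = [0] * (max_n + 1)
    m = 0
    while m * m <= max_n:
        theta[m * m] += 1 if m == 0 else 2  # n and -n give the same square
        m += 1
    result = [1] + [0] * max_n
    for _ in range(k):
        # Truncated Cauchy product result * theta.
        result = [
            sum(result[i] * theta[n - i] for i in range(n + 1))
            for n in range(max_n + 1)
        ]
    return result
```

For instance, `theta_power_coeffs(3, 5)[5]` recovers the value 24 quoted in the text for the number of ways of writing 5 as a sum of three squares.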

Jacobi introduced that important change-of-variables x = exp[πiτ] two centuries ago,
in his analysis of elliptic integrals. His theory is poorly remembered today, which is very 
disheartening considering how much of modern mathematics is touched by it. Have a 
look at the book [94], written over a century ago; the style of mathematics in our time 
is rather different from that in Jacobi’s, and we’ve lost a little in innocence what we’ve 
gained in power. See also the beautiful book [414]. Let’s briefly sketch Jacobi’s theory. 

Just as we could develop a theory of ‘circular functions’ (i.e. sine, etc.) starting from
the integral $s(a) = \int_0^a \frac{dx}{\sqrt{1 - x^2}}$, so we can develop a theory of ‘elliptic functions’ starting
from the elliptic integral $F(k, a) = \int_0^a \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}}$. Inverting s(a) gives a function
both more useful and with nicer properties than s(a): we call it sin(u). Similarly, for
any k the elliptic function sn(k, u) is defined by u = F(k, sn(k, u)). Just as we can
define a numerical constant π by sin(π/2) = 1 (i.e. $\frac{\pi}{2} = \int_0^1 \frac{dx}{\sqrt{1 - x^2}}$), we get a function
$K(k) = \int_0^1 \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}}$. Just as sin(u) has period 4(π/2), so sn has u-period 4K(k). sn
also turns out to have u-period 4iK(k′) where $k' = \sqrt{1 - k^2}$; today we take this as the
starting point and define an elliptic function to be doubly periodic or, what is the same
thing, to be a function on a torus (Section 2.1.3).

Theta functions aren’t elliptic functions, but they are closely related, as we see in
Section 2.1.3. In Jacobi’s language, we have

$$\theta_3\Big(\frac{i K(k')}{K(k)}\Big) = \sqrt{\frac{2 K(k)}{\pi}}.$$

The ‘modular transformation’ τ ↦ −1/τ interchanges the ‘modulus’ k with the
‘complementary modulus’ k′, and is completely natural in Jacobi’s theory. The important formula
$\theta_3(-1/\tau) = \sqrt{\tau/i}\; \theta_3(\tau)$ is trivial here. Closely related to this is Poincaré’s remarkable path
to modular functions (Section 3.2.4).

Surprisingly, many seemingly innocent questions can be dragged (usually with effort)
into the richly developed realm of elliptic curves and modular forms, where they are often
solved. For instance, we all know the ancient Greeks were interested in Pythagorean
triples: integer solutions a, b, c to a² + b² = c², or equivalently right-angle triangles
with rational side-lengths (Section 2.1.1).

There are two ways of pushing this. One is to ask which n ∈ Z can arise as areas of
these rational right-angle triangles. It turns out n = 5 is the smallest one: a = 3/2, b = 20/3,
c = 41/6 works (5 = ½(3/2)(20/3) and (3/2)² + (20/3)² = (41/6)²). This is a hard problem: just try
to show n = 1 cannot work. The number n = 157 works, though the simplest triangle
has a and b as quotients of integers of size around $10^{25}$, and c as the quotient of integers
around $10^{47}$. Although this problem was studied by the ancient Greeks and also by the
Arabs in the tenth century, it was finally cracked in the 1980s by first translating it into
the question of whether the elliptic curve y² = x³ − n²x has infinitely many rational
points.
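
The n = 5 triangle can be verified with exact rational arithmetic. The sketch below also produces a rational point on the curve y² = x³ − n²x from the triangle, using the standard substitution x = (c/2)², y = c(b² − a²)/8; that substitution is an assumption brought in for illustration and is not spelled out in the text.

```python
from fractions import Fraction

# Sides of the rational right-angle triangle of area 5.
a, b, c = Fraction(3, 2), Fraction(20, 3), Fraction(41, 6)

area = a * b / 2          # area of the triangle with legs a and b
is_right_angled = a * a + b * b == c * c

# Assumed standard map from a triangle of area n to a point on y^2 = x^3 - n^2 x.
n = area
x = (c / 2) ** 2
y = c * (b * b - a * a) / 8
on_curve = y * y == x ** 3 - n * n * x
```

Here x − n = ((a − b)/2)² and x + n = ((a + b)/2)², which is why x(x − n)(x + n) is a perfect square.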

The other extension of Pythagorean triples is more famous: find all integer solutions
to $a^n + b^n = c^n$. 350 years ago Fermat wrote in the margin of a book he was reading (the
book was describing at that point Diophantus’ classification of Pythagorean triples) that
he had found a ‘truly marvelous’ proof that for n > 2 there are no nontrivial solutions, but
that the margin was too narrow to contain it. This result came to be known as ‘Fermat’s
Last Theorem’⁶ and despite considerable effort no one has succeeded in rediscovering his
proof. Most people believe that Fermat soon realised his ‘proof’ wasn’t valid, otherwise
he would have alluded to it in later letters. In any case, a very long and complicated proof
was finally achieved in the 1990s: the ‘Taniyama–Shimura conjecture’ says that a certain
function associated with any elliptic curve over Q will be modular; if $a^n + b^n = c^n$
for some n > 2 and nonzero integers a, b, c, then the elliptic curve $y^2 = x^3 + (a^n - b^n)x^2 - a^n b^n x$
will violate that conjecture; finally, Wiles proved the Taniyama–Shimura
conjecture.

A certain interpretation of modular functions also indicates their usefulness, and
explains the adjective ‘modular’. The moduli space of tori is SL2(Z)\H (Section 2.1.4).
So if we have a complex-valued function F on the set of all tori, which associates the
same value to conformally equivalent tori (an example is the genus-1 partition function
(4.3.8b) in conformal field theories), then F is a function F : H → C, symmetric with
respect to SL2(Z).

Likewise, suppose we are interested in meromorphic functions f : Σ → C living on
some surface Σ. We know from the last section that almost every surface Σ is a quotient
Σ = Γ\H, for some Fuchsian group Γ. Then f can be lifted to a meromorphic function
on H with symmetry Γ.

What is the meaning of the Fourier expansion? Think of the parameter q as the local
coordinate about the cusp i∞. The Fourier expansion is simply the local expansion of
f about that cusp. There is a similar expansion about any other cusp x ∈ Q. In the case
of SL2(Z), all cusps are equivalent, but for smaller groups the cusps typically fall into
several distinct orbits, and the corresponding expansions carry independent information.
These coefficients are often quite interesting (e.g. they may give the numbers of solutions
to various equations, or the dimensions of certain subspaces). The modular form f is a
holomorphic interpolation between this local information.


6 It was called his ‘Last Theorem’ because it was the last of his 48 margin notes to be proved by other
mathematicians; a different margin note is discussed in Section 1.7. The story of Fermat’s Last Theorem is
a fascinating one, but alas this footnote is too small to do it credit. See for instance the excellent book [508].


2.2.2 Theta and eta 


Two modular forms that appear throughout the following are the Jacobi theta function
$\theta_3$ and the Dedekind eta function η:

$$\theta_3(\tau) := 1 + 2 \sum_{n=1}^{\infty} q^{n^2/2} = \prod_{m=1}^{\infty} (1 + q^{m - 1/2})^2 \prod_{n=1}^{\infty} (1 - q^n), \qquad (2.2.6a)$$

$$\eta(\tau) := q^{1/24} \prod_{n=1}^{\infty} (1 - q^n) = q^{1/24} \sum_{m=-\infty}^{\infty} (-1)^m q^{(3m^2 + m)/2}. \qquad (2.2.6b)$$

The equality in (2.2.6a) comes from the denominator identity (3.4.5b) for $A_1^{(1)}$, while that
in (2.2.6b) comes from Euler’s pentagonal identity; in both cases the first expressions are
more important. We saw $\theta_3$ last subsection. Unlike the Eisenstein series, its modularity is
not obvious. It can be established though in a number of ways, the most familiar perhaps
being Poisson summation. This says that for any rapidly decreasing smooth function
g : R → C (g is in the Schwartz space S(R) of Section 1.3.1),

$$\sum_{n \in \mathbb{Z}} g(n) = \sum_{m \in \mathbb{Z}} \hat{g}(m), \qquad (2.2.7a)$$

where $\hat{g}$ is the Fourier transform of g:

$$\hat{g}(y) = \int_{-\infty}^{\infty} e^{-2\pi i x y}\, g(x)\, dx. \qquad (2.2.7b)$$

Choose $g(x) = e^{-\pi t x^2}$ with real t > 0, so τ = it ∈ H; then $\hat{g}(y) = \sqrt{1/t}\; e^{-\pi y^2/t}$ and we
obtain (by analytic continuation to all τ ∈ H) the transformation formula for $\theta_3$ under
τ ↦ −1/τ:

$$\theta_3\Big(\frac{-1}{\tau}\Big) = \sqrt{\frac{\tau}{i}}\; \theta_3(\tau). \qquad (2.2.7c)$$

$\theta_3$ is a modular form for $\Gamma_\theta$ (2.2.5) of weight 1/2 and nontrivial multiplier. Both Poisson
summation and its application to (2.2.7c) are due to Gauss. In Question 2.2.4 you are
asked to prove Poisson summation, and next subsection we try to understand what it is
saying. In Sections 2.3.1, 2.3.4 and 2.4.2 we give alternate proofs of (2.2.7c).
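
The transformation law (2.2.7c) is easy to test numerically. The sketch below evaluates the truncated series θ3(τ) = Σ exp(πi n² τ) at a sample point of the upper half-plane and compares both sides; the sample point, the truncation and the function name are arbitrary illustrative choices.

```python
import cmath


def theta3(tau, terms=60):
    """Truncation of theta_3(tau) = sum over n in Z of exp(pi*i*n^2*tau), Im(tau) > 0."""
    return 1 + 2 * sum(
        cmath.exp(1j * cmath.pi * n * n * tau) for n in range(1, terms + 1)
    )


tau = 0.3 + 0.8j
lhs = theta3(-1 / tau)
rhs = cmath.sqrt(tau / 1j) * theta3(tau)   # principal branch of the square root
shift_invariance_error = abs(theta3(tau + 2) - theta3(tau))
```

The principal branch of the square root is the right one here because τ/i has positive real part whenever τ lies in the upper half-plane.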

The modularity of η can be summarised by

$$\eta(\tau + 1) = \xi_{24}\, \eta(\tau), \qquad (2.2.8a)$$

$$\eta\Big(\frac{-1}{\tau}\Big) = \sqrt{\frac{\tau}{i}}\; \eta(\tau), \qquad (2.2.8b)$$

where $\xi_{24} = \exp[2\pi i/24]$.
More generally, we get the complicated transformation law

$$\eta\Big(\frac{a\tau + b}{c\tau + d}\Big) = \mu(a, b, c, d)\, \sqrt{c\tau + d}\; \eta(\tau), \qquad \forall \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z}), \qquad (2.2.8c)$$

where, for c > 0, $\mu(a, b, c, d) = \exp\big(\pi i \big(\frac{a + d}{12c} - \frac{1}{4} - s(d, c)\big)\big)$ for the Dedekind sum

$$s(d, c) = \sum_{i=1}^{c-1} \frac{i}{c} \Big( \frac{di}{c} - \Big\lfloor \frac{di}{c} \Big\rfloor - \frac{1}{2} \Big). \qquad (2.2.8d)$$

For c = 0, the transformation follows immediately from (2.2.8a), while for c < 0 an
analogue to (2.2.8c) holds. The denominator of the rational number s(d, c) will always
divide 6c; μ will always be a 24th root of 1. Although Dedekind sums have many special
properties [468], we find in Section 2.4.3 a much cleaner way to write (2.2.8c). In any
case, η is a modular form for SL2(Z) of weight 1/2 and nontrivial multiplier.
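
The general transformation law can be checked numerically. The sketch below assumes the multiplier in the form μ(a, b, c, d) = exp(πi((a + d)/(12c) − 1/4 − s(d, c))) for c > 0, with the principal branch of √(cτ + d), and with s(d, c) the Dedekind sum Σ_{i=1}^{c−1} (i/c)(di/c − ⌊di/c⌋ − 1/2); the function names, the sample point and the truncation are illustrative choices.

```python
import cmath
from fractions import Fraction
from math import floor


def eta(tau, terms=400):
    """Truncated product q^(1/24) * prod (1 - q^n), with q = exp(2*pi*i*tau)."""
    q = cmath.exp(2j * cmath.pi * tau)
    product, q_power = 1, 1
    for _ in range(terms):
        q_power *= q
        product *= 1 - q_power
    return cmath.exp(2j * cmath.pi * tau / 24) * product


def dedekind_sum(d, c):
    """s(d, c) = sum_{i=1}^{c-1} (i/c) * (d*i/c - floor(d*i/c) - 1/2), exactly."""
    total = Fraction(0)
    for i in range(1, c):
        x = Fraction(d * i, c)
        total += Fraction(i, c) * (x - floor(x) - Fraction(1, 2))
    return total


def eta_multiplier(a, b, c, d):
    """Assumed form of mu(a, b, c, d) for a matrix of determinant 1 with c > 0."""
    if a * d - b * c != 1 or c <= 0:
        raise ValueError("need determinant 1 and c > 0")
    phase = Fraction(a + d, 12 * c) - Fraction(1, 4) - dedekind_sum(d, c)
    return cmath.exp(1j * cmath.pi * float(phase))


def transformation_error(a, b, c, d, tau):
    """|eta(gamma.tau) - mu * sqrt(c*tau + d) * eta(tau)| for gamma = (a b; c d)."""
    lhs = eta((a * tau + b) / (c * tau + d))
    rhs = eta_multiplier(a, b, c, d) * cmath.sqrt(c * tau + d) * eta(tau)
    return abs(lhs - rhs)
```

For the inversion (a, b, c, d) = (0, −1, 1, 0) the Dedekind sum is empty and the multiplier reduces to exp(−πi/4), which combines with √τ to give the prefactor √(τ/i) of (2.2.8b).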

Once again, (2.2.8a) is immediate from the definition (2.2.6b) and isn’t deep. There 
are several arguments in the literature that establish (2.2.8b), including Poisson sum- 
mation applied to the series in (2.2.6b). Here is another, which is instructive for other 
reasons. In the following paragraph, let’s not be distracted by mere analytic concerns, 
like convergence or interchanging integrals and infinite sums. 

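Fix τ = it, t > 0. The expression

$$-\frac{1}{4} \int (\theta_3(ist) - 1)(\theta_3(is/t) - 1)\, ds \qquad (2.2.9a)$$

is manifestly invariant under the transformation t ↦ 1/t. Applying the transformation
(2.2.7c) to $\theta_3(is/t)$ and expanding out both $\theta_3$’s, we get

$$-\frac{1}{4} \int \Big( 2 \sum_{\ell=1}^{\infty} e^{-\pi \ell^2 s t} \Big) \Big( \sqrt{\frac{t}{s}} \Big( 1 + 2 \sum_{n=1}^{\infty} e^{-\pi n^2 t/s} \Big) - 1 \Big)\, ds \qquad (2.2.9b)$$

$$= -\sum_{\ell=1}^{\infty} \sum_{n=1}^{\infty} \int \sqrt{\frac{t}{s}}\; e^{-\pi \ell^2 s t - \pi n^2 t/s}\, ds + \frac{1}{2} \sum_{\ell=1}^{\infty} \int e^{-\pi \ell^2 s t}\, ds - \frac{1}{2} \sum_{\ell=1}^{\infty} \int \sqrt{\frac{t}{s}}\; e^{-\pi \ell^2 s t}\, ds.$$

Now, replace the indefinite integral here with $\int_0^\infty$. The third term in the right-side of
(2.2.9b) is independent of t (to see this, change variables: y = ts) and so is a constant.
The second term can be evaluated explicitly:

$$\frac{1}{2} \sum_{\ell=1}^{\infty} \int_0^\infty e^{-\pi \ell^2 s t}\, ds = \frac{1}{2\pi t} \sum_{\ell=1}^{\infty} \frac{1}{\ell^2} = \frac{\pi}{12 t}. \qquad (2.2.9c)$$

To simplify the first term of (2.2.9b), replace s with x² and apply the identity

$$e^{-2\sqrt{ab}} = \frac{2\sqrt{a}}{\sqrt{\pi}} \int_0^\infty e^{-a x^2 - b x^{-2}}\, dx$$

(this is identity 3.325 of [258]) with a = πtℓ², b = πtn². The first term becomes

$$-\sum_{\ell=1}^{\infty} \sum_{n=1}^{\infty} \frac{1}{\ell}\, e^{-2\pi \ell n t} = \sum_{n=1}^{\infty} \log(1 - e^{-2\pi n t}).$$

Putting these together, we get

$$-\frac{1}{4} \int_0^\infty (\theta_3(ist) - 1)(\theta_3(is/t) - 1)\, ds = \log \eta(it) + \frac{\pi t}{12} + \frac{\pi}{12 t} + C$$

for some constant C.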

Two unfortunate remarks should probably be made regarding this calculation. First, it
would imply (2.2.8b) holds without the prefactor $\sqrt{\tau/i}$. Second, the constant C diverges,
as does the integral in (2.2.9a). Calculations like this mellow somewhat one’s disdain for
analysis. The way to proceed is to ‘regularise’ (2.2.9a) by subtracting from the integrand
near s = 0 the term $s^{-1}$ responsible for the divergence. This results in the identity

$$\log \eta(it) = -\frac{1}{4} \int_1^\infty (\theta_3(ist) - 1)(\theta_3(is/t) - 1)\, ds - \frac{1}{4} \int_0^1 \Big[ (\theta_3(ist) - 1)(\theta_3(is/t) - 1) - s^{-1} \Big]\, ds - \frac{\pi t}{12} - \frac{\pi}{12 t} - \frac{1}{4} \log t + c, \qquad (2.2.9d)$$

where c is a constant independent of t.
In Question 2.2.5 the reader is asked to fill in the details, proving (2.2.9d) and thus
(2.2.8b). We see from this argument that the mysterious power 1/24 in (2.2.6b), required
for the modularity of η, in fact equals ζ(2)/(2π)².

At least in spirit, this calculation is reminiscent of the regularisation of Feynman
integrals in quantum field theory (Section 4.2.3). For example, the Dedekind eta arises
in the calculation of the one-loop partition function of a boson compactified on a circle
(see e.g. section 8 of [246]). The normalisation factor there involves the product of the
nonzero eigenvalues of the Laplacian $\partial\bar{\partial}$ on the torus C/(Z + τZ): namely the
modulus-squared |D|² of

$$D(\tau) = \prod_{(m,n) \neq (0,0)} \frac{\pi}{\tau_2}\, (m - \tau n), \qquad (2.2.10a)$$

where $\tau_2$ = Im(τ) > 0. This expression diverges enthusiastically, but it is to be
interpreted using the substitutions (zeta-function regularisation)

$$\prod_{n=1}^{\infty} a = a^{\zeta(0)} = a^{-1/2}, \qquad \prod_{n=1}^{\infty} n^{\alpha} = e^{-\alpha \zeta'(0)} = (2\pi)^{\alpha/2}, \qquad \prod_{n=1}^{\infty} a^{n} = a^{\zeta(-1)} = a^{-1/12}, \qquad (2.2.10b)$$

where ζ here is the Riemann zeta function (2.2.3c). It is found that

$$D(\tau) = 2 \tau_2\, \eta(\tau)^2. \qquad (2.2.10c)$$

In this ‘derivation’ of η, the exponent 1/24 in (2.2.6b) equals −ζ(−1)/2. Since the values
ζ(−1) and ζ(2) are related by the functional equation (2.3.2), they are indeed
equivalent. Also, note that (2.2.10a) obeys D(τ + 1) = D(τ) and $D(-1/\tau) = D(\tau)/\bar{\tau}$, while
(2.2.10c) obeys $D(\tau + 1) = e^{\pi i/6} D(\tau)$ and $D(-1/\tau) = -i D(\tau)/\bar{\tau}$. Thus the
identifications (2.2.10b) don’t preserve modular behaviour. It is somewhat reminiscent of the
$-s^{-1}$ regularisation in (2.2.9), which breaks the t ↔ 1/t symmetry.
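
As a quick consistency check on the two descriptions of the exponent 1/24, one can insert the standard special values ζ(2) = π²/6 and ζ(−1) = −1/12 (quoted here as known facts rather than derived):

```latex
\frac{\zeta(2)}{(2\pi)^2} = \frac{\pi^2/6}{4\pi^2} = \frac{1}{24},
\qquad
-\frac{\zeta(-1)}{2} = -\frac{1}{2}\Bigl(-\frac{1}{12}\Bigr) = \frac{1}{24}.
```

Both routes therefore give the same power of q in the eta function.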

Prefactors $q^h$ as in (2.2.6b) are very common, as we shall see later with the characters
of Kac–Moody algebras or vertex algebras. In Monstrous Moonshine, this is the $q^{-1}$ with
which the j-function begins. These factors are a little mysterious (for example, why
start the grading in (0.3.1) at −1 rather than 0?) and there are several explanations
(Sections 3.1.2, 3.2.3 and 5.3.4). The point of our little digression into string theory is to
introduce its term conformal anomaly for this factor $q^h$. In physics, an anomaly is a symmetry
of a classical system that is broken in its quantisation. Here, the τ ↦ τ + 1 symmetry
(an aspect of conformal invariance) of D(τ) is broken by regularisation, an anomaly.

We see in (2.2.3a) that the coefficients of the q-expansion of Eisenstein series are
interesting. In fact, we are usually more interested in the coefficients of a modular form
than in the function itself. A classic example of this is the theta series of a lattice. Let
L ⊂ $\mathbb{R}^n$ be any n-dimensional positive-definite lattice (Section 1.2.1), and choose any
vector t ∈ $\mathbb{R}^n$. Define

$$\Theta_{t+L}(\tau) := \sum_{x \in t + L} q^{x \cdot x/2}. \qquad (2.2.11a)$$

In words, the coefficient of $q^r$ is the number of vectors in t + L with length $\sqrt{2r}$. For
example, $\Theta_{\mathbb{Z}} = \theta_3$. Let L be rational (i.e. for all u, v ∈ L we have u · v ∈ Q) and t have
finite order m in L (i.e. mt ∈ L). Then Poisson summation again yields

$$\Theta_{t+L}\Big(\frac{-1}{\tau}\Big) = \frac{(\tau/i)^{n/2}}{\sqrt{|L|}} \sum_{j=0}^{m-1} \xi_m^{j}\; \Theta_{js + L_0}(\tau), \qquad (2.2.11b)$$

where (as always) $\xi_m := \exp[2\pi i/m]$, s ∈ L* satisfies s · t ≡ 1/m (mod 1) (why must such
a vector s always exist?) and where $L_0$ = {u ∈ L* | u · t ∈ Z}. In particular,

$$\Theta_L\Big(\frac{-1}{\tau}\Big) = \frac{(\tau/i)^{n/2}}{\sqrt{|L|}}\; \Theta_{L^*}(\tau). \qquad (2.2.11c)$$

Definition 2.2.2 Let $\mathcal{I}$ be a finite set, and suppose for each i ∈ $\mathcal{I}$ we have a function
$f_i(\tau)$ meromorphic in H and with q-expansion $f_i(\tau) = \sum_{r \in \mathbb{Q}} a_{r,i}\, q^r$, such that for each
N only finitely many r < N have nonzero coefficients $a_{r,i}$. We call the set $\{f_i(\tau)\}_{i \in \mathcal{I}}$ a
vector-valued modular function for SL2(Z) with multiplier $\rho : \mathrm{SL}_2(\mathbb{Z}) \to \mathrm{GL}_{\mathcal{I}}(\mathbb{C})$ if, for
each $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ ∈ SL2(Z) and i ∈ $\mathcal{I}$, we have

$$f_i\Big(\frac{a\tau + b}{c\tau + d}\Big) = \sum_{j \in \mathcal{I}} \rho(A)_{ij}\, f_j(\tau).$$

The strange condition on the $a_{r,i}$ simply says that each $f_i$ is meromorphic at τ = i∞.
Vector-valued modular forms are studied in, for example, [350]. By the usual argument,
ρ will be a $\|\mathcal{I}\|$-dimensional representation of SL2(Z). We are interested in the case
when the matrices ρ(A) are unitary. In this case, at least when the functions $f_i(\tau)$
are linearly independent, a vector-valued modular function for SL2(Z) defines a flat,
holomorphic, Hermitian vector bundle over $\mathfrak{M}_{1,0}$: namely, the diagonal quotient (H ×
span{$f_i(\tau)$})/PSL2(Z). The fibre above any point in $\mathfrak{M}_{1,0}$ will be $\|\mathcal{I}\|$-dimensional,
except possibly for the singular points [i] and [$e^{\pi i/3}$]. The $f_i$ are holomorphic sections
of this bundle.

A classical property of theta functions, apparently due in this generality to Hecke
in 1940, anticipates beautifully what we see later in this book in more and more
generality.


Theorem 2.2.3 Let L ⊂ $\mathbb{R}^n$ be any n-dimensional positive-definite lattice.

(a) Suppose for all v ∈ L that v · v ∈ Q. Let t ∈ $\mathbb{R}^n$ be any vector with finite order in
L: i.e. mt ∈ L for some nonzero m ∈ Z. Then the theta series $\Theta_{t+L}(\tau)$, divided by
$\eta(\tau)^n$, is a modular function for some Γ(N).

(b) Suppose further that L is an even lattice (i.e. all v · v lie in 2Z), and let L* be its dual.
Write $t_i + L$, i = 1, ..., M, for the finitely many cosets in L*/L. Define a column
vector $X_L(\tau)$ with ith component $\Theta_{t_i + L}(\tau)/\eta(\tau)^n$. Then $X_L$ forms a vector-valued
modular function for SL2(Z).


For the proof of part (a), see theorem 20 of [456]. Part (b) follows quickly from (2.2.11b)
and (2.2.8). This theorem can be interpreted as being a special case of Theorem 3.2.3
below, when g is the affinisation of the reductive (abelian) Lie algebra $\mathbb{C}^n$. Note, however,
that the functions in (2.2.11a) are linearly dependent, and so the matrices ρ(A) are not
uniquely defined by (b). The easiest way to get linear independence is by adding variables
(Section 2.3.2).

The Leech lattice Λ (Section 1.2.1) is to lattices much as the Moonshine module $V^\natural$ is
to VOAs (see Section 7.2.1 below). It has no length-squared 2-vectors, and has precisely
196 560 length-squared 4-vectors, a number remarkably close to the monstrous 196 883.
Indeed its theta function $\Theta_\Lambda(\tau)$, when divided by $\eta(\tau)^{24}$, equals J(τ) + 24. Is this another
example of Moonshine, on par with McKay’s equation (0.2.1a)?

Indeed it is. However, for the Leech lattice Λ, we can quickly identify $\Theta_\Lambda(\tau)$ in terms
of J(τ) (see Question 2.2.7). Although the 196 560 ≈ 196 884 coincidence is thus easy
to explain, it nevertheless turns out to be an instructive example of Moonshine.
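
The arithmetic behind this coincidence can be reproduced with a few lines of power-series manipulation. The sketch below assumes the standard identification of the Leech theta series as E4³ − 720Δ, where Δ = η²⁴ = q Π(1 − qⁿ)²⁴; that identity is brought in as an assumption for illustration (the text defers the identification to Question 2.2.7), and all function names are illustrative.

```python
def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def multiply(a, b):
    """Product of two power series truncated to len(a) coefficients."""
    n = len(a)
    out = [0] * n
    for i, x in enumerate(a):
        for j in range(n - i):
            out[i + j] += x * b[j]
    return out


def euler_product_power(exponent, n_terms):
    """Coefficients of prod_{n>=1} (1 - q^n)^exponent, for an integer exponent >= 0."""
    series = [1] + [0] * (n_terms - 1)
    for n in range(1, n_terms):
        for _ in range(exponent):
            # Multiply in place by (1 - q^n), highest coefficient first.
            for i in range(n_terms - 1, n - 1, -1):
                series[i] -= series[i - n]
    return series


def reciprocal(a):
    """Power-series inverse of a, assuming a[0] == 1."""
    inv = [1] + [0] * (len(a) - 1)
    for n in range(1, len(a)):
        inv[n] = -sum(a[k] * inv[n - k] for k in range(1, n + 1))
    return inv


N = 5
E4 = [1] + [240 * sigma3(n) for n in range(1, N)]
P = euler_product_power(24, N)            # eta^24 = Delta = q * P
Delta = [0] + P[:-1]
E4_cubed = multiply(multiply(E4, E4), E4)
theta_leech = [x - 720 * y for x, y in zip(E4_cubed, Delta)]   # assumed identity
ratio = multiply(theta_leech, reciprocal(P))  # coefficients of q^-1, q^0, q^1, ...
```

The coefficient 196 560 of q² in the theta series combines with the coefficient 324 of q² in 1/Π(1 − qⁿ)²⁴ to give 196 884, the coefficient of q in J.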


2.2.3 Poisson summation 


Theta series (2.2.11a) are sums, over periodic sets, of the exponential of a quadratic 
polynomial. According to the argument given last subsection, two ingredients go into 
their modularity: together with Poisson summation (2.2.7a), we also needed the fact
that the Fourier transform of the Gaussian $e^{-\pi x^2}$ is essentially itself. Poisson summation
requires the infinite periodic sum. There are many other simple functions f that are 
likewise nearly invariant under Fourier transform: for example, the Fourier transform 
over R? of f(x, y) = e'/Y y-2/3sign(y) is if (x, —y/27). For several other examples, 
see [176]. To see how to use this to get ‘cubic’ analogues of theta functions (which will 
transform nicely with respect to SL3(Z)), as well as possible applications to physics, see 
the intriguing review [462] and references therein. 

What is the other ingredient, Poisson summation, really saying? Meaning arises from
a natural embedding of the particular into a more general context, so let’s try to generalise
Poisson summation.

First, let G be a group; we require it to be a topological group (separable and locally
compact). As defined in Section 1.5.5, its unitary dual $\hat{G}$ consists of all unitary irreducible
representations. For example, the unitary duals of R and Z can be identified with R and
S¹, respectively, while the unitary dual of compact groups (like finite G or G = S¹)
consists of a discrete set of points. When the group is abelian, the representations π ∈ $\hat{G}$
are all one-dimensional; the dual $\hat{G}$ itself forms an abelian group, and Pontryagin duality
says that the double-dual $\hat{\hat{G}}$ is isomorphic to G. For example, the representations in $\hat{\mathbb{R}}$
look like $\psi_\lambda(x) = e^{2\pi i \lambda x}$ for each λ ∈ R, so $\hat{\mathbb{R}} \cong \mathbb{R}$. When G is non-abelian, Pontryagin
duality becomes the more abstract Tannaka–Krein duality of Section 1.6.2.

Let us begin with abelian groups. Let Γ be a (discrete) subgroup of an abelian group
G, such that the quotient Γ\G is compact. The theta series modularity arguments last
subsection correspond to the choices G = R and Γ = Z and, more generally, G = $\mathbb{R}^n$
and Γ = L; of course the circle Z\R and the n-torus L\$\mathbb{R}^n$ are compact.

The Fourier transform f ↦ $\hat{f}$ for the group G (explicitly, $\hat{f}(\psi) = \int_G f(x)\, \overline{\psi(x)}\, dx$)
is a unitary map taking Schwartz functions on G to Schwartz functions on the dual $\hat{G}$.
Incidentally, the integrals here and below are with respect to the invariant Haar measure
(Section 1.5.4). Then the classical Poisson summation (2.2.7a) becomes

$$\int_\Gamma f(\gamma)\, d\gamma = \int_{\Gamma^\perp} \hat{f}(\psi)\, d\psi, \qquad (2.2.12)$$

where $\Gamma^\perp$ consists of all ψ ∈ $\hat{G}$ such that ψ(γ) = 1 for all γ ∈ Γ. The integrals here
reduce to sums, thanks to discreteness. It is through $\Gamma^\perp$ that the dual lattice L* enters
into (2.2.11c). Since $\mathbb{Z}^\perp = \mathbb{Z}$, we find that (2.2.12) is indeed a generalisation of (2.2.7a).
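
A finite toy version of (2.2.12) may help fix ideas. Take G = Z/12 with the subgroup Γ = {0, 3, 6, 9}; the characters of G are indexed by k in Z/12 via x ↦ exp(2πi kx/12). In the sketch below the groups, the test function and the normalisation factor |Γ|/|G| (which plays the role of the choice of Haar measures) are all illustrative assumptions.

```python
import cmath


def fourier_transform(f):
    """Finite Fourier transform on Z/N: f_hat[k] = sum_x f[x] * exp(-2*pi*i*k*x/N)."""
    N = len(f)
    return [
        sum(f[x] * cmath.exp(-2j * cmath.pi * k * x / N) for x in range(N))
        for k in range(N)
    ]


def annihilator(N, subgroup):
    """Indices k whose character x -> exp(2*pi*i*k*x/N) is trivial on the subgroup."""
    return [
        k for k in range(N)
        if all(abs(cmath.exp(2j * cmath.pi * k * g / N) - 1) < 1e-9 for g in subgroup)
    ]


N = 12
subgroup = list(range(0, N, 3))            # Gamma = {0, 3, 6, 9}
f = [(x * x + 1) % 7 for x in range(N)]    # an arbitrary test function on Z/12
f_hat = fourier_transform(f)
perp = annihilator(N, subgroup)            # the characters trivial on Gamma

lhs = sum(f[g] for g in subgroup)
rhs = len(subgroup) / N * sum(f_hat[k] for k in perp)
```

Here the annihilator consists of the multiples of 4, and summing the transform over it picks out exactly the values of f on Γ, up to the normalisation factor.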

(2.2.12) is too easy a generalisation to help us much. The meaning of Poisson
summation, and of (2.2.12), becomes a little clearer when we generalise to non-abelian groups.
Let Γ now be an arbitrary discrete closed subgroup of a separable locally compact group
G. G and Γ may or may not be abelian. For simplicity we assume that the coset space
Γ\G is compact. Then Γ\G has a finite invariant measure, and the space L²(Γ\G) of
square-integrable functions forms a Hilbert space (Section 1.3.1). The regular
representation R of G on L²(Γ\G) is defined by (R(x)f)(y) = f(yx), as usual, and is unitary.
This representation decomposes as a direct sum of irreducible unitary representations:

$$L^2(\Gamma \backslash G) = \bigoplus_{\pi \in \hat{G}} m_\pi\, \pi,$$

where the numbers $m_\pi$ ≥ 0 are the (finite) multiplicities.

Even though R is infinite-dimensional, we can define a character for it as follows. For
any sufficiently nice function φ on G (e.g. φ smooth and of compact support), define the
operator $R(\phi) = \int_G \phi(y)\, R(y)\, dy$ on L²(Γ\G) by

$$(R(\phi) f)(x) = \int_G \phi(y)\, f(xy)\, dy.$$

This assignment φ ↦ R(φ) forms a representation of the algebra of smooth functions
with compact support, with multiplication given by convolution φ ∗ φ′. The trace of an
operator is defined to be the sum of its eigenvalues. It can be shown that the trace tr R(φ)
exists, and in fact equals

$$\sum_{\pi \in \hat{G}} m_\pi\, \mathrm{tr}\, \pi(\phi) = \sum_{\gamma \in T} \mathrm{vol}(\Gamma_\gamma \backslash G_\gamma) \int_{G_\gamma \backslash G} \phi(x^{-1} \gamma x)\, dx, \qquad (2.2.13)$$

where T is a set of conjugacy class representatives in Γ, and $\Gamma_\gamma$ and $G_\gamma$ are the
stabilisers of γ in Γ and G, respectively (e.g. $\Gamma_\gamma$ = {g ∈ Γ | gγg⁻¹ = γ}). The left side of
(2.2.13) is obviously spectral, that is involves eigenvalues. The right side is geometric;
the integral over $G_\gamma \backslash G$ is called an ‘orbital integral’. Equation (2.2.13) has an immediate
generalisation: replace the regular representation R of G on L²(Γ\G) with any
representation of G induced from a finite-dimensional unitary representation ρ of Γ. The trivial
representation of Γ yields the regular representation R. [20] gives the straightforward
proof of (2.2.13) as well as other generalisations.

In the abelian case (e.g. G = $\mathbb{R}^n$, Γ = L), all $m_\pi$ = 0 or 1 and $\Gamma^\perp$ consists of all
ψ ∈ $\hat{G}$ with $m_\psi$ = 1, and (2.2.13) reduces to (2.2.12). In effect we have reinterpreted
the Fourier transform $\hat{f}(\psi)$, by fixing ψ ∈ $\hat{G}$ and varying the function f, as a sort
of character value for the (possibly infinite-dimensional) irreducible representation ψ.
Another special case of (2.2.13) is to take the group G to be finite, in which case it
reduces to Frobenius reciprocity. Interesting finite group applications are described in
chapters 22-25 of [522].

Equation (2.2.13) is called the Selberg trace formula; there is a more complicated
version (due in fuller generality to Arthur) when Γ\G is noncompact (in which case
there are continuous parts to the spectrum). Selberg (a 1950 Fields medalist) was most
interested in the case where G = SL2(R) and, for example, Γ = SL2(Z), which has
noncompact quotient. For this G he found explicit expressions for the orbital integrals,
and the resulting trace formula has powerful consequences.

The Selberg trace formula (2.2.13) can be thought of as an expression for the character
of the regular representation of G on L²(Γ\G). This expression is geometric in the
sense that for typical groups, the quantities on the right-side typically have geometric
interpretations (e.g. for G = SL2(R), and Γ a Fuchsian group acting without fixed points,
the orbital integrals can be expressed using lengths of closed geodesics on the compact
Riemann surface Γ\H). Of course these orbital integrals, and hence much of the potential
geometry, are trivial in the abelian group case used last section.

Although Poisson summation, and its generalisations like the Selberg trace formula, 
play a central role in the theory of automorphic forms and the Langlands programme, they
have only played sporadic roles so far in Moonshine and conformal field theory. For 
example, [130] applies the Selberg trace formula to string theory, to find the trace of 
the heat kernel. Orbital integrals also play a fundamental role in the approach [346] to 
understand group representations via coadjoint orbits; I. Frenkel extended this method 
to express the characters of affine Kac-Moody algebras as orbital integrals [198], and 
in this way obtained new proofs of the Macdonald identities. It seems unlikely though
that Poisson’s and Selberg’s formulae can provide a unified explanation of all
modularity proofs in Moonshine. A rigorous proof in mathematics may be too slick, much
as a painting can be too photographic. It seems to this author that, although Poisson
summation permits a quick proof of theta function modularity, it doesn’t tell us why it’s
true. A conceptual proof should open the door to natural generalisations of the given
theorem, by underscoring the confluence of properties needed for that theorem to hold.


2.2.4 Hauptmoduls 


Let’s identify the orbit space SL₂(Z)\H, by studying the fundamental domain D of
Figure 2.3. Apart from the boundary of D, every SL₂(Z)-orbit will intersect D in one and
only one point. But what should we do about the boundary? Well, the edge Re(τ) = −1/2
gets mapped by the translation T : τ ↦ τ + 1 to the edge Re(τ) = 1/2, so we should
identify these, i.e. glue them together. The result is a cylinder running off to infinity,
with a strange lip at the bottom. The inversion S : τ ↦ −1/τ tells us how we should
close that lip: identify ie^{iθ} and ie^{−iθ}. This seals the bottom of the cylinder, so we get an
infinitely tall cup with a strangely puckered base. In fact the top of this cup is also capped
off, by the cusp i∞. So what we have (topologically speaking) is a sphere. It inherits
the smoothness of H except for conical singularities at the fixed points i and e^{2πi/3}. The
cusps are responsible for compactness. This interpretation of SL₂(Z)\H means that a
modular function for SL₂(Z) can be reinterpreted as a meromorphic complex-valued
function on this sphere. There is a canonical sphere in complex analysis, namely the
Riemann sphere P¹(C) = C ∪ {∞}. The meromorphic functions on the Riemann sphere
must be rational, that is of the form f(w) = P(w)/Q(w) for polynomials P and Q, where w
is the complex parameter on the Riemann sphere. So a modular function f(τ) for SL₂(Z)
is simply some rational function P/Q evaluated at the change-of-local-parameters, or at
the uniformising function w = c(τ) that maps us from our sphere Γ\H to the Riemann
sphere. There are many different choices for this function c(τ), but the standard one is
the j-function:⁷

$$j(\tau) := \frac{\bigl(1 + 240\sum_{n=1}^{\infty}\sigma_3(n)\,q^n\bigr)^3}{q\prod_{n=1}^{\infty}(1-q^n)^{24}} = \frac{\Theta_{E_8}(\tau)^3}{\eta(\tau)^{24}} = q^{-1} + 744 + 196\,884\,q + 21\,493\,760\,q^2 + 864\,299\,970\,q^3 + \cdots \tag{2.2.14}$$


(see also (0.1.8)), where σ₃ is in (2.2.3b), Θ_{E8} is the theta series of the E₈ root lattice
(2.2.11a) and η is the Dedekind eta (2.2.6b). Thus, any modular function for SL₂(Z) can
be written as a rational function f(τ) = P(j(τ))/Q(j(τ)) in the j-function. Conversely,
any such function is modular.
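The first expression in (2.2.14) can be expanded mechanically. The following is a minimal sketch, assuming only that formula; the helper names (`sigma3`, `mul`, `inverse`, `j_coefficients`) are illustrative and not from the text. It uses exact integer arithmetic on truncated power series.

```python
def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

def mul(a, b, prec):
    """Product of two power series (coefficient lists), truncated to prec terms."""
    out = [0] * prec
    for i, ai in enumerate(a[:prec]):
        if ai:
            for j, bj in enumerate(b[:prec - i]):
                out[i + j] += ai * bj
    return out

def inverse(a, prec):
    """Reciprocal of a power series whose constant term is 1."""
    inv = [0] * prec
    inv[0] = 1
    for n in range(1, prec):
        inv[n] = -sum(a[k] * inv[n - k] for k in range(1, n + 1))
    return inv

def j_coefficients(prec):
    """Coefficients of q^-1, q^0, q^1, ... in j(tau); prec of them."""
    e4 = [1] + [240 * sigma3(n) for n in range(1, prec)]
    numerator = mul(mul(e4, e4, prec), e4, prec)
    # prod_{n>=1} (1 - q^n)^24, multiplying in one factor (1 - q^n) at a time
    product = [1] + [0] * (prec - 1)
    for n in range(1, prec):
        for _ in range(24):
            for k in range(prec - 1, n - 1, -1):
                product[k] -= product[k - n]
    # dividing by the extra q in the denominator only shifts the exponents by one
    return mul(numerator, inverse(product, prec), prec)

print(j_coefficients(5))
```

The list starts at the coefficient of q^{-1}, so its second entry is the constant term 744 that is dropped in passing from j to J.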

This is analogous to (and much stronger than) saying that any function g(x) periodic
under x ↦ x + 1 is really a function on the unit circle S¹ ⊂ C evaluated at the
uniformising function x ↦ e^{2πix}, and hence has a Fourier expansion Σ_n g_n exp[2πinx].

We can generalise the argument that led to j. Recall (2.2.4). 


Definition 2.2.4 Call a discrete subgroup Γ of SL₂(R) a congruence subgroup if it
contains some Γ(N). Call it of moonshine-type if it contains some Γ₀(N), and obeys

$$\begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} \in \Gamma \;\Rightarrow\; t \in \mathbb{Z}. \tag{2.2.15}$$


7 Historically, j was the standard choice, but in Monstrous Moonshine the preferred choice would be the
function J = j − 744 with zero constant term.

The congruence subgroups are relatively rare among finite index subgroups of SL₂(Z),
but their theory is much better developed. Let f be a modular function for a congruence
subgroup Γ. Then we can expand f as a Laurent series in q^{1/N}. We analyse this as
before: look at the orbit space Σ = Γ\H; because Γ is not too big, Σ will be a Riemann
surface; because Γ is not too small, Σ will be compact.

We call Γ ‘genus g’ if its surface Σ has genus g. If Γ is a subgroup of Γ(1) = SL₂(Z),
and without loss of generality we have −I ∈ Γ, then the genus is given by

$$g = 1 + \frac{n}{12} - \frac{n_2}{4} - \frac{n_3}{3} - \frac{n_\infty}{2},$$

where n is the index ‖Γ(1)/Γ‖ of Γ in Γ(1), and where n_k (k = 2, 3, ∞) is the number
of Γ-orbits of order-2k fixed points. For the easy proof from the Hurwitz formula, see
proposition 1.40 of [505]. Note that n_∞ is the number of punctures of Γ\H. For example,
for Γ = SL₂(Z) we have n = 1 = n₂ = n₃ = n_∞ and we recover our result that the
genus is 0. The values n, n₂, n₃, n_∞ for all Γ(N) and Γ₀(N) are given in Section 1.6
of [505].
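The genus formula is easy to evaluate by machine for the groups Γ₀(N). The sketch below assumes the standard closed formulas for the index, the elliptic-point counts and the cusp count of Γ₀(N) (they are not quoted in the text above, and the function names are illustrative only): n = N∏_{p|N}(1 + 1/p), n₂ = ∏_{p|N}(1 + (−1/p)) unless 4 | N, n₃ = ∏_{p|N}(1 + (−3/p)) unless 9 | N, and n_∞ = Σ_{d|N} φ(gcd(d, N/d)).

```python
from fractions import Fraction
from math import gcd

def prime_divisors(n):
    """Distinct prime divisors of n, by trial division."""
    ps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            ps.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        ps.append(n)
    return ps

def phi(n):
    """Euler's totient function."""
    result = n
    for p in prime_divisors(n):
        result -= result // p
    return result

def gamma0_data(N):
    """Return (index, n2, n3, cusps, genus) for Gamma_0(N)."""
    ps = prime_divisors(N)
    index = N
    for p in ps:
        index = index // p * (p + 1)
    n2 = 0 if N % 4 == 0 else 1
    n3 = 0 if N % 9 == 0 else 1
    for p in ps:
        # Legendre symbols (-1/p) and (-3/p), with value 0 at p = 2 resp. p = 3
        n2 *= 1 + (0 if p == 2 else (1 if p % 4 == 1 else -1))
        n3 *= 1 + (0 if p == 3 else (1 if p % 3 == 1 else -1))
    cusps = sum(phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)
    genus = (1 + Fraction(index, 12) - Fraction(n2, 4)
             - Fraction(n3, 3) - Fraction(cusps, 2))
    return index, n2, n3, cusps, genus

for N in (1, 2, 13, 24, 25, 50):
    print(N, gamma0_data(N))
```

Exact fractions are used so that a non-integral genus, which would signal a wrong input count, cannot be hidden by rounding. For N = 24 these formulas give 8 cusps and genus 1.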

For example, Γ = Γ₀(2) and Γ = Γ₀(25) are both genus 0 (with 2, respectively 6,
punctures), while Γ₀(50) is genus 2 with 12 punctures and Γ₀(24) is genus 1 with 8
punctures. Once again, we are interested here in the genus-0 case. As before, this means
that there is a uniformising function J_Γ that is a modular function for Γ, and all other
modular functions for Γ can be written as a rational function in it. Because of (2.2.15),
we can choose J_Γ to look like

$$J_\Gamma(\tau) = q^{-1} + a_1(\Gamma)\,q + a_2(\Gamma)\,q^2 + \cdots. \tag{2.2.16}$$


So J_Γ, the Hauptmodul for Γ, plays exactly the same role for Γ that J := j − 744 plays
for SL₂(Z). For example, Γ₀(2), Γ₀(13) and Γ₀(25) are all genus 0, with Hauptmoduls

$$J_2(\tau) = q^{-1} + 276q - 2048q^2 + 11202q^3 - 49152q^4 + 184024q^5 + \cdots, \tag{2.2.17a}$$
$$J_{13}(\tau) = q^{-1} - q + 2q^2 + q^3 + 2q^4 - 2q^5 - 2q^7 - 2q^8 + q^9 + \cdots, \tag{2.2.17b}$$
$$J_{25}(\tau) = q^{-1} - q + q^4 + q^6 - q^{11} - q^{14} + q^{21} + q^{24} - q^{26} + \cdots. \tag{2.2.17c}$$

The smaller (sparser) the modular group, the smaller the coefficients of the Hauptmodul.
In this sense, the j-function is optimally bad among the Hauptmoduls: for example, for
it a₂₃ ≈ 10²⁵.

In Theorem 2.1.5 we see what happens in genus > 0: two generators, not one, are 
needed, although they will be polynomially related. 

As is mentioned in Chapter 0, Monstrous Moonshine is interested directly in genus-0 
groups. We construct certain functions associated with the Monster, and it turns out 
unexpectedly that these functions are actually Hauptmoduls. 

An obvious question is, how many genus-0 groups (equivalently, how many
Hauptmoduls) are there? It turns out that Γ₀(p) is genus 0, for a prime p, iff p − 1 divides 24.
Thompson [526] proved that for any g, there are only finitely many genus-g groups of
moonshine type. Cummins [121] has shown that there are in fact exactly 6486 genus-0
groups of moonshine type. 616 of these have Hauptmoduls with integer coefficients
a_i(Γ), and all of the remainder have q-coefficients in some cyclotomic field.
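The prime-level criterion can be checked directly against the genus formula. This is a small sketch under the assumption (standard, but not stated in the text) that for a prime p the group Γ₀(p) has index p + 1, exactly two cusps, and elliptic-point counts 1 + (−1/p) and 1 + (−3/p); the function names are illustrative.

```python
from fractions import Fraction

def genus_gamma0_prime(p):
    """Genus of Gamma_0(p) for a prime p, from g = 1 + n/12 - n2/4 - n3/3 - ninf/2."""
    n2 = 1 + (0 if p == 2 else (1 if p % 4 == 1 else -1))
    n3 = 1 + (0 if p == 3 else (1 if p % 3 == 1 else -1))
    return 1 + Fraction(p + 1, 12) - Fraction(n2, 4) - Fraction(n3, 3) - 1

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

primes = [p for p in range(2, 200) if is_prime(p)]
genus_zero = [p for p in primes if genus_gamma0_prime(p) == 0]
divides_24 = [p for p in primes if 24 % (p - 1) == 0]
print(genus_zero, divides_24)
```

Both lists come out as the primes 2, 3, 5, 7 and 13 in the tested range.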


Question 2.2.1. How important are the conditions at the cusps for the definition of
modular functions or forms? For example, describe all functions f holomorphic on H,
symmetric with respect to SL₂(Z) (i.e. f(γ.τ) = f(τ) for all γ ∈ SL₂(Z)), but which
need not be holomorphic or even meromorphic at the cusps (i.e. f may have an essential
singularity there).


Question 2.2.2. Show that if f is a modular form of weight k, and 3 doesn’t divide k,
then f(e^{2πi/3}) = 0.


Question 2.2.3. Suppose f is a modular form, not identically 0, for some Γ, with
multiplier μ and integral weight k. Prove that μ must be a one-dimensional representation of
Γ. Where does the proof go wrong if k is fractional?


Question 2.2.4. Prove Poisson summation (2.2.7a). (Hint: x ↦ F(x) = Σ_{n∈Z} f(n + x)
is periodic, so can be Fourier expanded. Compute F(0) in two different ways.)


Question 2.2.5. By modifying slightly the argument beginning with (2.2.9a), prove
(2.2.9d) and thus (2.2.8b).


Question 2.2.6. Let L be any self-dual positive-definite lattice. Then Θ_L(τ) is a
polynomial in θ₃(τ) and Θ_{E8}(τ) (you can assume this, which is proved for instance in [503]).
Using this fact, show that the theta function for any self-dual positive-definite lattice of
dimension < 24 is uniquely determined by the numbers N₁, N₂ of norm-squared 1- and
2-vectors.


Question 2.2.7. Let L be a positive-definite 24-dimensional even self-dual lattice. Prove
that Θ_L(τ)/η(τ)²⁴ = J(τ) + c_L, for some constant c_L. Find that constant.


Question 2.2.8. Find the genus of Γ(2), using (2.2.16).


2.3 Further developments 
2.3.1 Dirichlet series 


One of the most remarkable formulae in science is surely

$$1 + 2 + 3 + 4 + \cdots = -\frac{1}{12}. \tag{2.3.1}$$

Of course the right side is the value at s = −1 of the Riemann zeta function (2.2.3c).


The expressions in (2.2.3c) converge absolutely when Re(s) > 1, where ζ is then
holomorphic, and ζ has a unique holomorphic extension to all of C, except for a simple pole
at s = 1 (the harmonic series). Equation (2.3.1) is used in quantum field theory in the
context of zeta function regularisation (2.2.10); it is related to the q^{1/24} in the Dedekind
eta function (2.2.6b) and the normalisation C/12 in Lie brackets (3.1.5a) of the Virasoro
algebra.

The equality of the infinite sum and product in (2.2.3c) is merely an analytic
reformulation of unique factorisation in Z, but it shows crucially the relation between ζ(s) and
the primes. For a trivial example, taking logs of (2.2.3c) quickly gives the divergence of
Σ_p 1/p.

As important as analytic continuation and the product expansion are, more important
for us is the functional equation

$$\Lambda(1 - s) = \Lambda(s), \tag{2.3.2}$$

where Λ(s) := π^{−s/2} Γ(s/2) ζ(s), using the Gamma function

$$\Gamma(s) := \int_0^\infty e^{-t}\,t^{s-1}\,dt.$$

Indeed, Hecke discovered that (2.3.2) is equivalent to modularity (2.2.7c).


Theorem 2.3.1 (Hecke, 1936) Let f(τ) = Σ_{n=0}^∞ a_n e^{2πinτ/d} and φ(s) = Σ_{n=1}^∞ a_n n^{−s},
where |a_n| < C n^c for some constants d, C, c. Define Φ(s) = (2π/d)^{−s} Γ(s) φ(s). Then
the following two statements are equivalent:

(i) f(−1/τ) = (τ/i)^k f(τ);

(ii) Φ(k − s) = Φ(s), and Φ(s) + a₀/s + a₀/(k − s) is holomorphic and bounded in each
vertical strip in C.


Proof: The key idea of the proof is that Φ(s) and f(τ) are related by the Mellin
transform:

$$\Phi(s) = \int_0^\infty x^{s-1}\,\bigl(f(ix) - a_0\bigr)\,dx, \tag{2.3.3a}$$
$$f(ix) - a_0 = \frac{1}{2\pi i}\int_{\mathrm{Re}(s)=a} x^{-s}\,\Phi(s)\,ds, \tag{2.3.3b}$$

for any constant a > 0 sufficiently large.

To prove (ii) from (i), write ∫_0^∞ = ∫_0^1 + ∫_1^∞ in (2.3.3a), so we get the sum Φ(s) =
Φ₀(s) + Φ_∞(s). Note that Φ_∞(s) is clearly holomorphic everywhere, and Φ₀(s) is holomorphic
when Re(s) is sufficiently large. Then, using (i), for those s

$$\Phi_0(s) = \int_0^1 x^{s-1}\bigl(f(ix) - a_0\bigr)\,dx = \int_1^\infty x^{-s-1}\,x^k f(ix)\,dx - \frac{a_0}{s} = \Phi_\infty(k - s) - \frac{a_0}{s} - \frac{a_0}{k-s}.$$

Therefore Φ₀(s) extends holomorphically everywhere, except for simple poles at s = 0
and s = k, and Φ₀(s) = Φ_∞(k − s) − a₀s^{−1} − a₀(k − s)^{−1} holds ∀ s ≠ 0, k. Thus

$$\Phi(k - s) = \Phi_0(k - s) + \Phi_\infty(k - s) = \left(\Phi_\infty(s) - \frac{a_0}{k-s} - \frac{a_0}{s}\right) + \left(\Phi_0(s) + \frac{a_0}{s} + \frac{a_0}{k-s}\right) = \Phi(s).$$

To prove (i) from (ii), shift the vertical contour Re(s) = a > 0 in (2.3.3b) to the left,
to Re(s) = b < 0, and pick up residues −a₀ at s = 0 and x^{−k}a₀ at s = k:

$$f(ix) - a_0 x^{-k} = \frac{1}{2\pi i}\int_{\mathrm{Re}(s)=b} x^{-s}\,\Phi(s)\,ds = \frac{1}{2\pi i}\int_{\mathrm{Re}(s)=k-b} x^{-(k-s)}\,\Phi(s)\,ds = x^{-k}\bigl(f(i/x) - a_0\bigr).$$

Therefore f(i/x) = x^k f(ix), and (i) follows by analytic continuation. ∎


When f is a modular form, we call φ the Dirichlet series or L-function corresponding to
f (the term L-function is usually reserved for those φ which also have product expansions
as in (2.2.3c)). The modular form corresponding to the Riemann zeta function ζ(s) is
f(τ) = ½θ₃(τ). Theorem 2.3.1 applies with k = 1/2, d = 2 and Λ(2s) = Φ(s), and relates
(2.3.2) directly to (2.2.7c). Another famous example, due to Ramanujan, is f = η²⁴. Its Φ
is holomorphic everywhere and its φ has a product form ∏_p (1 − τ(p)p^{−s} + p^{11−2s})^{−1},
where τ here is the so-called Ramanujan tau-function (see e.g. (3.4.6)).
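A product expansion of this shape is equivalent to arithmetic relations among the coefficients: τ(mn) = τ(m)τ(n) for coprime m, n, and τ(p²) = τ(p)² − p¹¹ for primes p. The sketch below checks a few of these numerically, assuming only that η²⁴ = q∏(1 − qⁿ)²⁴; the function name `tau_values` is illustrative.

```python
def tau_values(nmax):
    """tau(1), ..., tau(nmax), read off from q * prod_{n>=1} (1 - q^n)^24."""
    coeffs = [1] + [0] * nmax            # power series in q, truncated at q^nmax
    for n in range(1, nmax + 1):
        for _ in range(24):              # multiply by (1 - q^n), 24 times
            for k in range(nmax, n - 1, -1):
                coeffs[k] -= coeffs[k - n]
    # the overall factor q shifts exponents by one: tau(n) = coeffs[n - 1]
    return {n: coeffs[n - 1] for n in range(1, nmax + 1)}

tau = tau_values(12)
print([tau[n] for n in range(1, 7)])
print(tau[6] == tau[2] * tau[3], tau[4] == tau[2] ** 2 - 2 ** 11)
```

Such checks only probe finitely many coefficients; they illustrate the product form rather than prove it.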

Mysteriously, we can associate Dirichlet series to many of the basic objects of arith- 
metic — modular forms, number fields, algebraic varieties, etc. — in such a way that 
basic operations performed on, and relations between, the Dirichlet series correspond to 
natural operations on, and relations between, the arithmetic objects. In its most general 
form, this is Langlands functoriality. For a famous special case, given an elliptic curve E 
defined over Q, its L-function keeps track of the number of points on E as we vary its field 
of definition from Q to the finite fields. The Taniyama—Shimura Conjecture states that E 
is modular, i.e. that this L-function is the Dirichlet series of a modular form of weight 2. 
As we know, Wiles et al. proved Taniyama—Shimura and hence Fermat’s Last Theorem. 

See [456] for a clear treatment of the material of this subsection. We have been hurried 
since there is at this point no evidence for its direct relevance to Moonshine. There are 
many generalisations of Theorem 2.3.1. Let us mention one. Generators for the groups
SL₂(Z) and Γ_θ are given in (2.2.1a) and (2.2.5), so Theorem 2.3.1 gives a Dirichlet series
characterisation for f to be a modular form for those groups. When Γ is smaller (say
Γ = Γ(N)), to which Dirichlet series conditions does the modularity of f translate? The
list of generators is far more complicated. An answer is provided by Weil’s Converse
Theorem (Section 2.3.3).


2.3.2 Jacobi forms 
The general quadratic polynomial in one variable x looks like ax² + bx + c, so we
might try to generalise θ₃(τ) by replacing n²τ with an²τ + bnz + cu. Consider then the
function

$$\theta_3(\tau, z, u) = \sum_{n\in\mathbb{Z}} e^{\pi i \tau n^2 + 2\pi i z n + 2\pi i u}, \tag{2.3.4}$$

where τ, z, u ∈ C. We’ve seen these kinds of functions before in (2.1.7a). The 2πi’s in
front of z and u are conventional. As before, convergence requires τ ∈ H. Obviously,
the u-dependence is rather trivial and is retained only for book-keeping.

Fix τ ∈ H and u ∈ C, and consider this as a function of z ∈ C. It has period 1 and
quasi-period τ:

$$\theta_3(\tau, z + m\tau + \ell, u + mz + m^2\tau/2) = \theta_3(\tau, z, u), \quad \forall m, \ell \in \mathbb{Z}, \tag{2.3.5a}$$

and thus is a function living (projectively) on the torus C/(Z + τZ).

Next, fix z, u ∈ C and consider θ₃ as a function of τ ∈ H. Completing the square τn² +
2nz = τ(n + z/τ)² − z²/τ and restricting τ, z to the imaginary axis, Poisson summation
(2.2.7a) and analytic continuation give us

$$\theta_3(\tau, z, u) = \frac{1}{\sqrt{\tau/i}}\;\theta_3\!\left(\frac{-1}{\tau}, \frac{z}{\tau}, u - \frac{z^2}{2\tau}\right), \tag{2.3.5b}$$

valid for all τ ∈ H and z, u ∈ C.
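Both transformation laws can be tested numerically at a generic point. The following is a rough sketch, assuming the definition θ₃(τ, z, u) = Σ_n exp[πiτn² + 2πizn + 2πiu] and the principal branch of the square root (which is the right one when τ/i has positive real part); the truncation length and the sample point are arbitrary choices.

```python
import cmath

def theta3(tau, z, u, terms=60):
    """Truncated theta series: sum over |n| <= terms of exp[pi i (tau n^2 + 2 z n + 2 u)]."""
    return sum(cmath.exp(cmath.pi * 1j * (tau * n * n + 2 * z * n + 2 * u))
               for n in range(-terms, terms + 1))

tau, z, u = 0.3 + 0.8j, 0.17 - 0.05j, 0.2 + 0.1j
reference = theta3(tau, z, u)

# quasi-periodicity in z, with m = 2 and l = -3
m, l = 2, -3
shifted = theta3(tau, z + m * tau + l, u + m * z + m * m * tau / 2)

# behaviour under tau -> -1/tau
inverted = theta3(-1 / tau, z / tau, u - z * z / (2 * tau)) / cmath.sqrt(tau / 1j)

print(abs(shifted - reference), abs(inverted - reference))
```

Both differences should vanish up to floating-point rounding, since the neglected tail of the series is far below machine precision for Im(τ) of this size.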


Definition 2.3.2 [170] By a Jacobi form for SL₂(Z) of weight k and index m we mean
a holomorphic function f : H × C → C satisfying

$$f\!\left(\frac{a\tau+b}{c\tau+d}, \frac{z}{c\tau+d}\right) = (c\tau+d)^k \exp\!\left[\frac{2\pi i\, m\, c\, z^2}{c\tau+d}\right] f(\tau, z), \tag{2.3.6a}$$
$$f(\tau, z + \ell\tau + n) = \exp\bigl[-2\pi i\, m\,(\ell^2\tau + 2\ell z)\bigr]\, f(\tau, z), \tag{2.3.6b}$$

for all (a b; c d) ∈ SL₂(Z) and ℓ, n ∈ Z. Moreover, f must have a Fourier expansion of
the form

$$f(\tau, z) = \sum_{n\in\mathbb{N}}\ \sum_{r\in\mathbb{Z},\ r^2\le 4mn} c(n, r)\, e^{2\pi i (n\tau + rz)}. \tag{2.3.6c}$$

Similarly, comparing (2.3.5) with (2.3.6), we call θ₃(τ, z, 0) a Jacobi form of weight 1/2
and index 1/2 for Γ_θ. The Weierstrass ℘-function ℘(τ, z) in (2.1.6a) is a (meromorphic)
Jacobi form for SL₂(Z) of weight 2 and index 0 (Question 2.3.1). A Jacobi form is a natural
blend of the notions of modular form and elliptic
function: the parameter τ ∈ H tells us where on the moduli space of tori we are, and the
parameter z lives on that torus. Given such classical examples, it is hard to understand
why their theory was developed only in the 1980s. The introduction of the index m in
Definition 2.3.2 may be somewhat unexpected, but is explained in Section 2.4.1.

We can generalise the example (2.3.4) to lattices (and in fact to translates of lattices).
Let L be an n-dimensional lattice in R^n. Define

$$\Theta_L(\tau, z, u) = \sum_{v\in L} \exp[\pi i\,\tau\, v\cdot v + 2\pi i\, z\cdot v + 2\pi i\, u], \tag{2.3.7}$$

where z ∈ C^n, u ∈ C and τ ∈ H. The z-periods of Θ_L fill out the dual lattice L*, and the
z-quasi-periods fill out τL*. Provided L is a rational lattice, we get the obvious analogue
of (2.2.11c), again from Poisson summation. To make Θ_L into a Jacobi form for some
Γ(N) at weight n/2 and index u*·u*/2, it suffices to embed z ∈ C into C^n along any
nonzero dual weight vector u* ∈ L*: i.e. Θ_L(τ, zu*, 0) will be a Jacobi form.

As any string theorist knows, there are several different lattices L, L′ that have the
same theta function: Θ_L(τ) = Θ_{L′}(τ). Perhaps the most famous example of this is the
pair of even self-dual lattices of dimension 16 (namely, D₁₆⁺ and E₈ ⊕ E₈ [113]). Actually
there are lattice examples in every dimension > 3 [108]. However, their Jacobi forms
are unique in the strongest form possible (see Question 2.3.2).

Writing theta functions as Jacobi forms is crucial to their interpretation as heat kernels, 
or using Heisenberg groups, as we see in Sections 2.3.4 and 2.4.2. In Theorem 3.2.3 we 
find that the characters of affine Kac-Moody algebras are Jacobi forms of weight and 
index 0. Indeed, they are rational functions of lattice Jacobi forms (2.3.7). 

An obvious question to ask is, to any modular form f(τ), is there a Jacobi form
f̃(τ, z) for the same group and at the same weight such that f̃(τ, 0) = f(τ)? And
if so, is this Jacobi form unique? It turns out that every weight-k modular form f, at
least for SL₂(Z), can be lifted to a Jacobi form for the same weight and group, at index
m = 1. This Jacobi form is far from unique, even at m = 1. In fact, the redundancy has
the same dimension as the space of weight-(k + 2) cusp forms for SL₂(Z). This fact is a
consequence of theorem 3.5 in [170].


2.3.3 Twisted #2: shifts and twists 
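Recall the classical Jacobi theta functions θ₁ = θ_{1/2,1/2}, θ₂ = θ_{1/2,0}, θ₃ = θ_{0,0}, θ₄ = θ_{0,1/2},
using the notation of (2.1.7a). These obey simple modular transformation rules, most
concisely stated in vector notation as

$$\begin{pmatrix}\theta_1\\ \theta_2\\ \theta_3\\ \theta_4\end{pmatrix}(\tau+1, z) = \begin{pmatrix} e^{\pi i/4} & 0 & 0 & 0\\ 0 & e^{\pi i/4} & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0\end{pmatrix}\begin{pmatrix}\theta_1\\ \theta_2\\ \theta_3\\ \theta_4\end{pmatrix}(\tau, z), \tag{2.3.8a}$$
$$\begin{pmatrix}\theta_1\\ \theta_2\\ \theta_3\\ \theta_4\end{pmatrix}\!\left(\frac{-1}{\tau}, \frac{z}{\tau}\right) = \sqrt{\frac{\tau}{i}}\,\exp\!\left[\frac{\pi i z^2}{\tau}\right]\begin{pmatrix} -i & 0 & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\end{pmatrix}\begin{pmatrix}\theta_1\\ \theta_2\\ \theta_3\\ \theta_4\end{pmatrix}(\tau, z). \tag{2.3.8b}$$

That is, these θ_i define a vector-valued Jacobi form for SL₂(Z) (Definition 2.2.2). The
q-expansions of θ₁ and θ₄ have negative coefficients; we can make ‘positive’ combinations
of these theta functions that have almost as nice transformations under SL₂(Z):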


$$\theta_{[0]}(\tau, z) = \frac{\theta_3(\tau, z) + \theta_4(\tau, z)}{2} = 1 + q^2r^2 + q^2r^{-2} + q^8r^4 + q^8r^{-4} + \cdots, \tag{2.3.9a}$$
$$\theta_{[1]}(\tau, z) = \frac{\theta_2(\tau, z) + \theta_1(\tau, z)}{2} = q^{1/8}\bigl(r^{1/2} + q\,r^{-3/2} + q^3r^{5/2} + q^6r^{-7/2} + \cdots\bigr), \tag{2.3.9b}$$
$$\theta_{[2]}(\tau, z) = \frac{\theta_3(\tau, z) - \theta_4(\tau, z)}{2} = q^{1/2}\bigl(r + r^{-1} + q^4r^3 + q^4r^{-3} + \cdots\bigr), \tag{2.3.9c}$$
$$\theta_{[3]}(\tau, z) = \frac{\theta_2(\tau, z) - \theta_1(\tau, z)}{2} = q^{1/8}\bigl(r^{-1/2} + q\,r^{3/2} + q^3r^{-5/2} + q^6r^{7/2} + \cdots\bigr), \tag{2.3.9d}$$

where r = e^{2πiz}. Note that θ_{[1]} has the geometric interpretation as the theta series (2.2.11a)
of the translate 2Z + ½.

We regard θ₁, θ₂, θ₄ as Z₂-twists and -shifts of θ₃. More generally, the parameter
r ∈ (1/N)Z in θ_{r,s} corresponds to a Z_N-shift, and s ∈ (1/N)Z to a Z_N-twist. A far-reaching
generalisation of this simple construction is studied in Section 5.3.6; the analogue there
of the positive combinations (2.3.9) is the characters for a vertex operator algebra. In
Monstrous Moonshine the twists of J(τ) are the McKay–Thompson series J_g(τ), and its
more general shifts and twists are the Norton series of Maxi-Moonshine (Section 7.3.2).
Physically, this corresponds to the orbifold construction (Section 4.3.4). There, the
positive linear combinations have the direct interpretation as graded dimensions of sectors
of the conformal field theory.

As always, the clearest example is provided by lattices (Section 1.2.1). Let L be an
integral positive-definite lattice and let r, s be two vectors in Q ⊗ L. As in (2.1.7a), write

$$\Theta_{r,s}(\tau, z) = \sum_{x\in L} e^{\pi i\,\tau\,(x+r)\cdot(x+r) + 2\pi i\, z\cdot(x+r) + \pi i\, s\cdot(2x+r)}, \tag{2.3.10a}$$

where as before z ∈ C ⊗ L. Then Θ_{r,s} will be a Jacobi form for some subgroup of
SL₂(Z), as is (2.3.7). In fact, if L is even and self-dual, we can be much more explicit.
For any r, s ∈ Q ⊗ L, and any (a b; c d) ∈ SL₂(Z), we have

$$\Theta_{r,s}\!\left(\frac{a\tau+b}{c\tau+d}, \frac{z}{c\tau+d}\right) = (c\tau+d)^{n/2}\exp\!\left[\frac{\pi i\, c\, z\cdot z}{c\tau+d}\right]\Theta_{ar+cs,\,br+ds}(\tau, z), \tag{2.3.10b}$$

where n is the dimension of L.

As usual, certain positive combinations of these Θ_{r,s} have a direct (geometric)
interpretation. Again let L be self-dual, and suppose the vector s ∈ Q ⊗ L has order m in L
(so ms ∈ L). Then there will be a vector s′ ∈ L such that s·s′ ≡ 1/m (mod 1). For any
integer k, and vector r ∈ Q ⊗ L, we get this generalisation of (2.3.9):

$$\frac{1}{m}\sum_{j=0}^{m-1}\exp\!\left[-\pi i\, j\left(s\cdot r + \frac{2k}{m}\right)\right]\Theta_{r,\,js} = \Theta_{L_0 + r + ks'}, \tag{2.3.10c}$$

the theta series of a translate of the lattice L₀ = {v ∈ L | v·s ∈ Z}.

In the orbifold construction of vertex operator algebras and chiral conformal field
theory, the role of vectors r, s is played by automorphisms g, h in some group G, and
the role of the sublattice L₀ in (2.3.10c) is played by the vertex operator subalgebra V^G
fixed by G. However, as we see in Section 4.3, full conformal field theory or string theory
involves the interplay of two vertex operator algebras; the orbifold construction there
involves in addition a reconstruction of a new full conformal field theory from V^G. We
address this further in Sections 4.3.4 and 5.3.6.

This reconstruction is again beautifully illustrated by lattices. Let L be any rational
lattice and T = {t_i} be a finite set of vectors in Q ⊗ L. Then by L{T} we mean the set

$$L\{T\} = \Bigl\{\, x + \sum_i b_i t_i \;\Bigm|\; b_i \in \mathbb{Z},\ x \in L,\ \Bigl(x + \sum_i b_i t_i\Bigr)\cdot t_j \in \mathbb{Z}\ \ \forall j \,\Bigr\}.$$

Then L{T} is a lattice rationally equivalent to L (i.e. there is an orthogonal
transformation T : Q ⊗ L{T} → Q ⊗ L). Conversely, if L₁ and L₂ are rationally equivalent
integral lattices, then there is a finite set T = {t₁, ..., t_m} ⊂ Q ⊗ L₁ such that L₁{T} is
isomorphic to L₂ [238]. Clearly the theta series of L{T} is the average of Θ_{r,s} for a
finite number of r, s in the Z-span of T. The important special case is when L is self-dual;
then L{T} will also be self-dual provided all t_i·t_j ∈ Z. In this case,

$$L\{T\} = \bigcup_{\ell_i\in\mathbb{Z}}\Bigl(\sum_i \ell_i t_i + L_T\Bigr),$$

where L_T = {x ∈ L | x·t_i ∈ Z}. Call two self-dual lattices L₁, L₂ neighbours if there is
some vector t with integer length-square t·t such that 2t ∈ L₁, and L₂ and L₁{t} are
isomorphic. Then any two self-dual lattices, with equal dimensions n₊ + n₋ and signature
n₊ − n₋, will be neighbours of neighbours of ⋯ of neighbours of each other [238].
Another way to collect some of these results is through Dirichlet characters, which are
important in the classical theory of modular forms. A Dirichlet character is a function
χ : Z → C, with some period N, such that χ(a) ≠ 0 iff a is coprime to N, and for
all a, b ∈ Z χ(ab) = χ(a)χ(b). Dirichlet introduced these χ in his proof that there are
infinitely many primes in any arithmetic progression a, a + b, a + 2b, ..., provided only that
a and b are coprime (clearly a necessary condition). He proved this by twisting the
Riemann zeta function (2.2.3c) by χ:

$$L(\chi, s) = \sum_{n=1}^{\infty}\chi(n)\,n^{-s} = \prod_p \bigl(1 - \chi(p)\,p^{-s}\bigr)^{-1}. \tag{2.3.11}$$

Given the lesson of Section 2.3.1, it should also be interesting to Dirichlet-twist modular
forms.
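The equality of sum and product in (2.3.11) can be illustrated with the simplest non-trivial example, the character of period 4. This is a small numerical sketch; the choice of character, the value s = 2 and the cutoff are arbitrary, and the names `chi4` and `primes_up_to` are illustrative.

```python
def chi4(n):
    """The non-trivial Dirichlet character of period 4: 1, 0, -1, 0 on n = 1, 2, 3, 4."""
    return (0, 1, 0, -1)[n % 4]

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit**0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = [False] * len(sieve[p * p::p])
    return [p for p, is_p in enumerate(sieve) if is_p]

s, cutoff = 2.0, 100_000

# truncated Dirichlet series
dirichlet_sum = sum(chi4(n) / n**s for n in range(1, cutoff + 1))

# truncated Euler product over the primes below the same cutoff
euler_product = 1.0
for p in primes_up_to(cutoff):
    euler_product /= 1 - chi4(p) / p**s

print(dirichlet_sum, euler_product)
```

The two truncations agree to roughly six decimal places; the product converges more slowly than the alternating sum, so the cutoff limits the agreement.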
Modular forms and functions for the principal congruence subgroup Γ(N) can be
defined as in Definitions 2.2.1 and 0.1, except now there are several orbits of cusps, and
we have invariance under only (1 N; 0 1), so the q-expansion takes the form

$$f(\tau) = \sum_{n\in\mathbb{Z}} a_n\, e^{2\pi i n\tau/N} = \sum_{n\in\mathbb{Z}} a_n\, q^{n/N}. \tag{2.3.12}$$

Given any Dirichlet character χ, we can twist this function f and obtain

$$f_\chi(\tau) := \sum_{n\in\mathbb{Z}} \chi(n)\, a_n\, q^{n/N}. \tag{2.3.13}$$

Then if f is a modular form for Γ(N), f_χ will be a modular form of the same weight for
some Γ(M). It isn’t very deep that modularity should be preserved; see Question 2.3.4
for one such argument. Theorem 14 in [456] provides a generalisation. The Dirichlet
twist takes on a clear algebraic significance in the context of automorphic representations
(Section 2.4.1).

A deeper use of Dirichlet twists is Weil’s Converse Theorem (see e.g. theorem 17 of
[456] or page 64 of [90]), which characterises modular forms for Γ(N) by generalising
Theorem 2.3.1, using infinitely many Dirichlet twists. It is a ‘converse’ in that it
generalises the converse of (i) ⇒ (ii). Applications of this are given in sections 1.9 and 1.10
of [89].

A more surprising example of twisting is by Galois automorphisms. Let F_N be the
space (in fact field) of all modular functions for Γ(N), with q-expansion as in (2.3.12),
where each coefficient a_n lies in the cyclotomic field Q[ξ_N] (recall Section 1.7.3). This
field F_N is explicitly constructed in section 6.2 of [505]. Clearly, j(τ) lies in each F_N. It
can be shown that F_N is a Galois extension over Q(j), with Galois group

$$\mathrm{Gal}\bigl(F_N/\mathbb{Q}(j(\tau))\bigr) \cong \mathrm{GL}_2(\mathbb{Z}_N)/\{\pm 1\} \tag{2.3.14a}$$

(see Section 1.7.2 for definitions). For any matrix A ∈ GL₂(Z_N), we can find an
integer ℓ ∈ Z (namely ℓ = det(A)) and a matrix B = (a b; c d) ∈ SL₂(Z) such that
A ≡ B (1 0; 0 ℓ) (mod N). Then the action of A ∈ GL₂(Z_N) on a modular function f(τ)
is given by

$$A.f(\tau) = (\sigma_\ell f)\!\left(\frac{a\tau + c}{b\tau + d}\right), \tag{2.3.14b}$$
$$\sigma_\ell \sum_{n\in\mathbb{Z}} a_n\, q^{n/N} = \sum_{n\in\mathbb{Z}} \sigma_\ell(a_n)\, q^{n/N}, \tag{2.3.14c}$$

where σ_ℓ ∈ Gal(Q[ξ_N]/Q) sends ξ_N to ξ_N^ℓ. This Galois action plays a technical but
important role in both Moonshine (e.g. Question 7.3.3) and rational conformal field
theory (e.g. Section 6.1); see Section 6.3.3 for some speculation.


2.3.4 The remarkable heat kernel 


Various topological proofs of modularity, inspired by conformal field theory, have arisen 
in recent years. For instance [24], [203] and section 6 of [502] all provide proofs for
η(τ). These suggest the thought that, more generally, modularity (and hence Moonshine)
may be a topological effect (Section 7.2.4). The oldest and perhaps most fundamental
observation along these lines is the relation between theta function modularity and the 
heat kernel. 

Fourier determined that the rate of flow of heat energy in a material is proportional
to the gradient of the temperature, and thus wrote down the diffusion or heat equation,
which in one dimension looks like

$$\frac{\partial}{\partial t}u(t, x) = \frac{1}{4\pi}\frac{\partial^2}{\partial x^2}u(t, x), \quad \forall x \in \mathbb{R},\ \forall t > 0 \tag{2.3.15a}$$

(the harmless normalisation 1/4π is introduced for later convenience). Suppose that
the initial distribution of heat in the infinite rod is f(x) = lim_{t→0} u(t, x). Then Fourier
analysis tells us how to find a solution u(t, x) for all times t. Letting


$$\hat{u}(t, \alpha) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} u(t, y)\,e^{-i\alpha y}\,dy, \qquad \hat{f}(\alpha) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(y)\,e^{-i\alpha y}\,dy,$$

the equation to be solved has been transformed to dû/dt = −α²û/4π, with initial
condition f̂, which has the solution û(t, α) = f̂(α) e^{−α²t/4π}. We can now find u by using
the inverse transform:

$$u(t, x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\alpha x - \alpha^2 t/4\pi}\int_{-\infty}^{\infty} f(y)\,e^{-i\alpha y}\,dy\,d\alpha.$$

But

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\alpha x - \alpha^2 t/4\pi}\,d\alpha = t^{-1/2}\,e^{-\pi x^2/t} =: K(t, x).$$

Thus u(t, x) is given by the convolution

$$u(t, x) = \int_{-\infty}^{\infty} K(t, x - y)\,f(y)\,dy. \tag{2.3.15b}$$

We see that K(t, x) is itself a solution to the heat equation, with initial condition f(x) =
δ(x), the Dirac delta. Physically, K corresponds to an infinitely hot spot placed at position
x = 0 at time t = 0, on an otherwise uniform, infinitely long rod. This fundamental
solution K(t, x) is called the heat kernel or propagator for R.

What has this to do with the theta function? Consider the specialisation θ₃(it, x), where
t, x ∈ R, t > 0. Note that

$$\frac{\partial}{\partial t}\theta_3(it, x) = -\pi\sum_{n\in\mathbb{Z}} n^2\,e^{-\pi t n^2 + 2\pi i n x} = \frac{1}{4\pi}\frac{\partial^2}{\partial x^2}\theta_3(it, x),$$

so θ₃ is a solution to the heat equation. Also, in the t → 0 limit, θ₃(0, x) becomes the
distribution Σ_{n∈Z} δ(x − n) (this is proved by evaluating lim_{t→0} ∫ θ₃(it, x) f(x) dx,
but is merely the statement that Σ_n e^{2πinx} = Σ_m δ(x − m)). Thus θ₃ plays the same role
on the circle R/Z that K(t, x) played on the line R: θ₃ is the heat kernel for the circle.
But we can obtain this kernel in another way, by averaging the heat kernel K(t, x) for
R:

$$\sum_{n=-\infty}^{\infty} t^{-1/2}\,e^{-\pi (x+n)^2/t} = t^{-1/2}\,e^{-\pi x^2/t}\;\theta_3\!\left(\frac{i}{t}, \frac{x}{it}\right).$$

Equating this to θ₃(it, x) recovers (2.3.5b).
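The two descriptions of the heat kernel on the circle, a sum over images of the kernel on the line and a sum over Fourier modes, can be compared numerically. The sketch below assumes the normalisation K(t, x) = t^{−1/2} exp(−πx²/t) used above; the function names, sample points and truncation are illustrative choices.

```python
import math

def K(t, x):
    """Heat kernel on the line for du/dt = (1/4pi) d^2u/dx^2."""
    return math.exp(-math.pi * x * x / t) / math.sqrt(t)

def circle_kernel_by_images(t, x, terms=50):
    """Average the line kernel over all integer translates of x."""
    return sum(K(t, x - n) for n in range(-terms, terms + 1))

def circle_kernel_by_modes(t, x, terms=50):
    """theta_3(it, x) = sum_n exp(-pi t n^2) exp(2 pi i n x); imaginary parts cancel in pairs."""
    return sum(math.exp(-math.pi * t * n * n) * math.cos(2 * math.pi * n * x)
               for n in range(-terms, terms + 1))

samples = [(0.1, 0.3), (1.0, 0.25), (3.0, 0.9)]
for t, x in samples:
    print(t, x, circle_kernel_by_images(t, x), circle_kernel_by_modes(t, x))
```

For small t the image sum converges fastest and for large t the mode sum does, which is the practical content of the theta transformation law.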
As with Poisson summation, the notion of heat kernel can be generalised considerably.
For example, let M be a compact n-dimensional Riemannian manifold and let Δ be the
Laplacian. In local coordinates,

$$\Delta = -\sum_{i,j} g^{ij}(x)\,\frac{\partial^2}{\partial x_i\,\partial x_j} + \cdots,$$

where g^{ij}(x) is the metric (the dots are lower-order terms). The heat equation on M is

$$\frac{\partial}{\partial t}u(t, x) = -\Delta u(t, x), \quad x \in M,\ t > 0,$$

with initial condition f(x) = lim_{t→0} u(t, x). This can be solved formally by the
expression u(t, x) = e^{−tΔ} f(x). In fact e^{−tΔ} makes sense as an operator on L²(M), for any t ∈ C
with Re(t) > 0. By the heat kernel K(t, x, y) for M we mean as before the solution to
the heat equation with initial condition δ(x, y), or equivalently K(t, x, y) generates the
solution (e^{−tΔ} f)(x) = ∫_M K(t, x, y) f(y) dy to the heat equation with arbitrary initial
condition f. The heat kernel always exists and is unique, and is analytic for t > 0. In
fact the heat kernel can be expressed as

$$K(t, x, y) = \sum_n e^{-t\lambda_n}\,\phi_n(x)\,\phi_n(y),$$

where λ_n ≥ 0 are the (discrete) eigenvalues of the Laplacian Δ with (orthonormal)
eigenfunctions φ_n ∈ C^∞(M) ⊂ L²(M). Incidentally, K is the kernel of the operator
e^{−tΔ} in the sense of the Schwartz kernel theorem. For t small,

$$K(t, x, y) = (4\pi t)^{-n/2}\,e^{-d(x,y)^2/4t}\,\sum_{i=0}^{\infty} t^i f_i(x, y),$$

where d(x, y) is the distance between x, y ∈ M, and f_i are certain functions. In the
language of quantum field theory, the heat kernel K(t, x, y) equals ⟨x|e^{−tΔ}|y⟩. The heat
kernel stores geometric information on M, and interpolates between the identity operator
of L²(M) at t = 0 and the projection onto the kernel of Δ as t → ∞.

For example, for M = R^n the heat kernel is K(t, x, y) = (4πt)^{−n/2} exp[−|x − y|²/4t],
so for any n-dimensional lattice L ⊂ R^n the heat kernel of the n-torus R^n/L is

$$(4\pi t)^{-n/2}\sum_{v\in L}\exp\bigl[-|x - y - v|^2/4t\bigr].$$

But it also equals (normalising the arguments appropriately) a constant multiple of
Θ_{L*}, and so we recover the modularity of (2.3.7).

The natural generalisation of the M = R^n calculation is performed by [231]. In
particular, let G be a connected, noncompact reductive Lie group, let K be a maximal compact
subgroup, and let Γ be a discrete subgroup of G such that the quotient Γ\G is compact.
Then two expressions for the heat kernel, and its trace, on the space Γ\G/K are obtained.
In the special case of G = R^n and Γ being a lattice, the trace formula reduces to the
usual formula expressing Θ_L(−1/τ). The naturality of this construction Γ\G/K will be
clear after reading Section 2.4.1. Moreover, [181] proves the Macdonald identities using
the heat equation on compact Lie groups.

Further generalisations are possible (see e.g. [52]). For example, degree 1 and 0 terms
can be added to the Laplacian Δ, and we can consider more generally differential
operators on sections of line bundles over M, rather than on M. Heat kernel techniques can be
used to prove various formulations of the Atiyah–Singer Index Theorem, and equivariant
analogues of the theory yield the Atiyah–Bott fixed-point theorem. The strategy typically
followed by these applications is to consider the integral I(t) = ∫_M K(t, f(y), x) dy for
some map f : M → N, where K is the heat kernel on N. The t → 0 limit collapses the
integral to an integral or sum over f^{−1}(x). But a global expression for I(t) can often
be found, for example using representation theory or geometry; taking its t → 0 limit
yields an identity between the local integral ∫_{f^{−1}(x)} and some global data of M and N.
See, for example, [389]. Question 2.1.7 is essentially an example of this strategy: what
we call K(g, h) there is the heat kernel at t = 0 of the finite group G.

Some of the many applications and occurrences of the heat kernel are collected in 
[320]. But can the heat kernel be directly relevant to Moonshine? This seems very 
possible. After all, the Atiyah—Bott fixed-point theorem yields an elegant proof of the 
Weyl character formula for compact Lie groups. In the conformal field theories associated
with Lie groups (namely, the Wess—Zumino—Witten models), the heat kernel is used to 
explicitly construct the flat Knizhnik—Zamolodchikov connection on spaces of chiral 
blocks [288] (more on this starting in Section 3.2.4). This is significant because, according 
to conformal field theory, it is the monodromy of the Knizhnik—Zamolodchikov equation 
that is responsible (in genus 1) for the modularity of the affine algebra characters. 

To this author’s knowledge, heat kernel methods have never been used directly in the 
context of Monstrous Moonshine, but surely they can be used to prove at minimum the 
modularity of the McKay—Thompson series, and to help us understand a little better 
the geometry of Monstrous Moonshine. It seems possible that equivariant heat kernel 
methods could provide a geometric umbrella under which herd the more interesting 
examples of Moonshine. 


2.3.5 Siegel forms 


Vaughn Jones considered how one von Neumann algebra can be embedded in another 
(e.g. itself), and the result — subfactor theory — is profoundly interesting. This success 
suggests the following analogue of Galois theory: 


The Jones Programme Study the ways in which one infinite beast can be embedded 
in another. 


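Let’s probe this thought with the simplest infinite beast this author can think of: lattices
(Section 1.2.1). Let $L \subset \mathbb{R}^n$, $L' \subset \mathbb{R}^{n'}$ be lattices of dimension $n$ and $n'$, respectively. Fix
bases $\{x^{(1)}, \ldots, x^{(n)}\}$, $\{y^{(1)}, \ldots, y^{(n')}\}$ and construct the $n \times n$ matrix $M$, whose columns
are the $x^{(i)}$, and likewise the $n' \times n'$ matrix $M'$ whose columns are the $y^{(i)}$. An embedding
of $L'$ into $L$ is a linear map $\varphi : L' \to L$ that preserves all inner-products. It is determined
by the values $\varphi(y^{(i)}) = \sum_j \varphi_{ij}\, x^{(j)}$. The coefficients $\varphi_{ij}$ all lie in $\mathbb{Z}$ and form an
$n' \times n$ matrix $(\varphi)$. Now, $\varphi$ preserves all inner-products iff
$\varphi(y^{(i)}) \cdot \varphi(y^{(j)}) = y^{(i)} \cdot y^{(j)}$ for all $i, j$, iff


$(\varphi)\, M^t M\, (\varphi)^t = M'^t M'$. (2.3.16)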


Let $N(L', L)$ be the number of these embeddings, i.e. the number of $n' \times n$ $\mathbb{Z}$-matrices
$(\varphi)$ satisfying (2.3.16). This number will be 0 unless $n' \le n$.

For example, $N(\mathbb{Z}, L)$ equals the number of unit vectors in $L$. Thus, if $L$ is integral,
the generating function $\sum_{k=0}^{\infty} N(\sqrt{k}\,\mathbb{Z}, L)\, x^k$ is the theta function $\Theta_L(\tau)$, for $x = e^{\pi i \tau}$.
We might hope that the numbers $N(L', L)$ are coefficients of some other modular-like
function.
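
As a small illustration of the counting above, the following sketch tabulates $N(\sqrt{k}\,\mathbb{Z}, L)$, that is, the number of vectors of norm $k$ in $L$, by brute force. The function name `embedding_counts` and the `box` search radius are hypothetical choices made for this example, and the caller is assumed to pick `box` large enough to contain every vector of norm at most `max_norm`.

```python
from itertools import product

def embedding_counts(gram, max_norm, box):
    """counts[k] = number of v in L with v.v = k, i.e. N(sqrt(k) Z, L), for 0 <= k <= max_norm.

    gram: integer Gram matrix of a basis of L (list of lists).
    box:  search radius for the integer coordinates; assumed large enough to
          contain every lattice vector of norm <= max_norm.
    """
    n = len(gram)
    counts = [0] * (max_norm + 1)
    for c in product(range(-box, box + 1), repeat=n):
        # norm of the lattice vector with coordinates c in the chosen basis
        norm = sum(c[i] * gram[i][j] * c[j] for i in range(n) for j in range(n))
        if norm <= max_norm:
            counts[norm] += 1
    return counts

if __name__ == "__main__":
    # L = Z^2: these are the first coefficients of Theta_L in x = e^{pi i tau}
    print(embedding_counts([[1, 0], [0, 1]], 5, box=3))  # -> [1, 4, 4, 0, 4, 8]
```

Each embedding of $\sqrt{k}\,\mathbb{Z}$ is determined by the image of its generator, which is why counting vectors of norm $k$ suffices here.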
Construct a multi-variable generating function as follows. Fix an $n$-dimensional integral
lattice $L$. Let $x_{ij}$, $1 \le i, j \le n$, be variables. Consider


$\mathrm{Th}_L(x_{ij}) := \sum_{n'=0}^{n}\ \sum_{[L']}\ \sum_{\mathbb{Z}\beta_1 + \cdots + \mathbb{Z}\beta_n = L'} \frac{N(L', L)}{\|\mathrm{Aut}\, L'\|} \prod_{1 \le i,j \le n} x_{ij}^{\beta_i \cdot \beta_j}$. (2.3.17a)


The sum over $[L']$ is of all isomorphism classes of $n'$-dimensional even lattices. For each
of these classes, fix a representative $L' \subset \mathbb{R}^{n'}$. The $\{\beta_i\}$ run over all possible ordered
$n$-tuples of lattice vectors that span $L'$. There is an equivalent but cleaner way to write
(2.3.17a). Let $\mathcal{A}_n$ be the set of all $n \times n$ positive semidefinite matrices $A$ with integer
entries and even integers down the diagonal. These are precisely the matrices
$A_{ij} = \beta_i \cdot \beta_j$. Then


$\mathrm{Th}_L = \sum_{A' \in \mathcal{A}_n} N(L', L) \prod_{1 \le i,j \le n} x_{ij}^{A'_{ij}}$, (2.3.17b)

where $L'$ is any lattice realising the matrix $A'$ of inner-products.

In any case, this generating function $\mathrm{Th}_L$, after making the change-of-variables
$x_{ij} = e^{\pi i T_{ij}}$, is a Siegel modular form! We return to it shortly.

Let’s try to find a version of modular forms where $\mathbb{H}$ is replaced by a higher-dimensional
space. Start with $\Theta_L(\tau, z)$ in equation (2.3.7), but reinterpret this as a
function of the complex matrix $T := \tau A$, with entries $A_{ij} = b^{(i)} \cdot b^{(j)}$ for a basis $b^{(i)}$
of the lattice $L$. We thus get


$\Theta(T, z) := \sum_{n \in \mathbb{Z}^n} \exp[\pi i\, n \cdot Tn + 2\pi i\, n \cdot z]$. (2.3.18)

How far can we extend the domain $T$? We may as well restrict to symmetric matrices $T$.
For which symmetric matrices $T$ does (2.3.18) converge to a holomorphic function? We
know from (2.3.7) that it does whenever $T = xA + iA$ for any positive-definite matrix $A$
and real number $x$, but there is no need to restrict to such $T$. Indeed, it is straightforward
to obtain that (2.3.18) converges to a holomorphic function for any $z \in \mathbb{C}^n$ and any $T$ in
the Siegel upper half-space $\mathbb{H}_n$, defined in Section 2.1.4.
Of course, (2.3.18) is quasi-periodic in the $z$ variable:


$\Theta(T, z + m) = \Theta(T, z)$, $\forall m \in \mathbb{Z}^n$, (2.3.19a)
$\Theta(T, z + Tm) = \exp[-\pi i\, m \cdot Tm - 2\pi i\, m \cdot z]\, \Theta(T, z)$, $\forall m \in \mathbb{Z}^n$. (2.3.19b)
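
The quasi-periodicity (2.3.19b) can be sanity-checked numerically on a truncated series. The sketch below does this for $n = 2$; the sample point `T_SAMPLE`, the vector `Z_SAMPLE`, the truncation `radius` and the function names are all assumptions made for illustration, not part of the text.

```python
import cmath
from itertools import product

def siegel_theta(T, z, radius=8):
    """Truncation of (2.3.18): sum over integer vectors v with |v_i| <= radius."""
    n = len(T)
    total = 0j
    for v in product(range(-radius, radius + 1), repeat=n):
        quad = sum(v[i] * T[i][j] * v[j] for i in range(n) for j in range(n))
        lin = sum(v[i] * z[i] for i in range(n))
        total += cmath.exp(1j * cmath.pi * quad + 2j * cmath.pi * lin)
    return total

def quasi_periodicity_defect(T, z, m):
    """Relative difference between the two sides of (2.3.19b)."""
    n = len(T)
    Tm = [sum(T[i][j] * m[j] for j in range(n)) for i in range(n)]
    lhs = siegel_theta(T, [z[i] + Tm[i] for i in range(n)])
    mTm = sum(m[i] * Tm[i] for i in range(n))
    mz = sum(m[i] * z[i] for i in range(n))
    rhs = cmath.exp(-1j * cmath.pi * mTm - 2j * cmath.pi * mz) * siegel_theta(T, z)
    return abs(lhs - rhs) / abs(rhs)

# Hypothetical sample data: T is symmetric with positive-definite imaginary part
T_SAMPLE = [[0.3 + 1.1j, 0.2 + 0.1j], [0.2 + 0.1j, -0.1 + 0.9j]]
Z_SAMPLE = [0.15 + 0.2j, -0.4 + 0.05j]

if __name__ == "__main__":
    print(quasi_periodicity_defect(T_SAMPLE, Z_SAMPLE, [1, -1]))  # tiny, at rounding level
```

The truncation is harmless here because the summands decay like a Gaussian in the summation vector once the imaginary part of $T$ is positive-definite.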
The Siegel theta function $\Theta(T, z)$ is an easy generalisation of the Jacobi theta function
(2.3.7). What makes it so remarkable is its symmetries as a function of $T$:

$\Theta\big((AT + B)(CT + D)^{-1},\ ((CT + D)^{-1})^t z\big) = \xi_\gamma\, \det(CT + D)^{1/2}\, \exp[\pi i\, z \cdot ((CT + D)^{-1} C z)]\, \Theta(T, z)$ (2.3.20)

for all $\gamma = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathrm{Sp}_{2n}(\mathbb{Z})$ for which all diagonal entries of $A^t C$ and $B^t D$ are
even. Call this subgroup $\Gamma_\theta^{(n)}$, in analogy with (2.2.5). The numbers $\xi_\gamma \in \mathbb{C}$ are certain
eighth roots of unity.
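We defined $\mathrm{Sp}_{2n}(\mathbb{Z})$ in Section 2.1.4. The modularity of $\Theta(T, z)$ is proved much the
way that modularity of $\theta_3$ was proved. The analogue of (2.2.1a) is


$\mathrm{Sp}_{2n}(\mathbb{Z}) = \left\langle \begin{pmatrix} I & A \\ 0 & I \end{pmatrix}, \begin{pmatrix} B & 0 \\ 0 & (B^t)^{-1} \end{pmatrix}, \begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix} \ \middle|\ \forall A \in M_{n \times n}(\mathbb{Z}),\ A = A^t,\ \forall B \in \mathrm{GL}_n(\mathbb{Z}) \right\rangle$. (2.3.21)


If we insist the matrices $A$ in (2.3.21) have even diagonals, then we generate $\Gamma_\theta^{(n)}$.
Verifying invariance of $\Theta(T, z)$ under $\begin{pmatrix} I & A \\ 0 & I \end{pmatrix}$ and $\begin{pmatrix} B & 0 \\ 0 & (B^t)^{-1} \end{pmatrix}$ is routine; use Poisson
summation for $\begin{pmatrix} 0 & -I \\ I & 0 \end{pmatrix}$. The argument is given in detail in chapter 2.5 of [439].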


Section 2.1.4 relates $\mathbb{H}_n$ to Riemann surfaces of genus $n$. As we recall, the possible
period matrices $\Omega$ of a given surface form an $\mathrm{Sp}_{2n}(\mathbb{Z})$-orbit in $\mathbb{H}_n$. The Jacobian of
the surface is $\mathbb{C}^n/(\mathbb{Z}^n + \Omega\mathbb{Z}^n)$. Quasi-periodicity (2.3.19) embeds these Jacobians into
projective space. Most points in $\mathbb{H}_n$ (at least for $n > 3$) aren’t period matrices of surfaces,
and as we recall the moduli space $\mathfrak{M}_{n,0}$ can be identified with $\mathfrak{C}_n/\mathrm{Sp}_{2n}(\mathbb{Z})$ for some subset
$\mathfrak{C}_n$ in $\mathbb{H}_n$.

We should thus regard $\Theta(T, z)$, $\mathrm{Sp}_{2n}(\mathbb{Z})$ and $\mathbb{H}_n$ as the genus $n$ versions of $\theta_3$,
$\mathrm{SL}_2(\mathbb{Z}) = \mathrm{Sp}_2(\mathbb{Z})$ and $\mathbb{H}$, where $\tau$ becomes an $n \times n$ matrix. The hyperbolic geometry of
$\mathbb{H}$ becomes symplectic geometry on $\mathbb{H}_n$ (see e.g. section 4 of [395]). As mentioned in
footnote 1 of this chapter, the future will find Moonshine expanding into higher genus.
The calculations will be far more complicated, and this is presumably the reason for the
delay. One of the few explicit works in this direction is [533], which looks at the
lattice $\leftrightarrow$ theta function example of Figure 0.1 (or equivalently the bosonic string
compactified on a torus) at genus 2. As expected, Siegel modular forms play a dominant
role. See also [9] for some calculations with multi-loop heterotic strings, which heavily
involve Siegel theta functions.


Definition 2.3.3 Let $\Gamma \subseteq \mathrm{Sp}_{2n}(\mathbb{Z})$ ($n > 1$) have finite index. Then a Siegel modular
form of weight $k$ and level $\Gamma$ is a holomorphic function $f$ on $\mathbb{H}_n$ such that

$f\big((AT + B)(CT + D)^{-1}\big) = \det(CT + D)^k f(T)$, $\forall \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \Gamma$.

A growth condition at the cusps (requiring holomorphicity) is automatically satisfied
when $n > 1$. Another simplification of higher genus is that any subgroup $\Gamma \subseteq \mathrm{Sp}_{2n}(\mathbb{Z})$ of
finite index includes some congruence group $\Gamma^{(n)}(N) := \{A \in \mathrm{Sp}_{2n}(\mathbb{Z}) \mid A \equiv I \pmod N\}$
with finite index.

For example, $\Theta(T, z)$ at $z = 0$ is a modular form of weight $1/2$ and level $\Gamma^{(n)}(4)$, as (2.3.20)
shows. Eisenstein series for $\mathrm{Sp}_{2n}(\mathbb{Z})$ can be defined in the obvious way, as a sum of
$\det(CT + D)^{-k}$ over appropriately defined pairs $\{C, D\}$ of matrices (see e.g. section 14 of
[395] for details). A final example plays the same role for $\Theta(T, z)$ that $\Theta_L(\tau)$ played for
$\theta_3(\tau)$: let $L$ be any $m$-dimensional rational lattice and let $A$ be its Gram matrix, then


$\Theta_L(T, Z) := \sum_N \exp[\pi i\, \mathrm{tr}(N^t T N A) + 2\pi i\, \mathrm{tr}(N^t Z)]$,

where $T \in \mathbb{H}_n$, $Z$ is an $n \times m$ complex matrix, and the sum is over all $n \times m$ $\mathbb{Z}$-matrices $N$.
This is a specialisation of $\Theta$ for $\mathrm{Sp}_{2nm}(\mathbb{Z})$, and is a Siegel modular form of weight $m/2$
for some $\Gamma^{(n)}(M)$ (see e.g. chapter 2.6 of [439]). We met $\Theta_L$ in (2.3.17).

Finally, let us describe the analogue of Fourier expansion here. For convenience take $\Gamma$
to be $\mathrm{Sp}_{2n}(\mathbb{Z})$. Then a modular form $f$ for $\Gamma$ obeys the periodicity $f(T + B) = f(T)$ for
all symmetric $n \times n$ $\mathbb{Z}$-matrices $B$. Together with holomorphicity, this means $f$ has an
expansion


$f(T) = \sum_{M \ge 0} a(M) \exp[2\pi i\, \mathrm{tr}(TM)]$, (2.3.22)

where the sum is over all positive-semidefinite symmetric $n \times n$ matrices $M$ with entries
$M_{ii} \in \mathbb{Z}$ and $M_{ij} \in \frac{1}{2}\mathbb{Z}$. These numbers $a(M)$ play the role of Fourier coefficients here.
For example, (2.3.17b) gives the Fourier expansion of $\Theta_L(T)$.


Question 2.3.1. Prove that the Weierstrass $\wp$ function (2.1.6a) is a Jacobi form for $\mathrm{SL}_2(\mathbb{Z})$
with weight $k = 2$ and index $m = 0$.


Question 2.3.2. Let $L$, $L'$ be two $n$-dimensional rational lattices in $\mathbb{R}^n$, and let $u, u' \in \mathbb{R}^n$
be vectors of finite order for $L$ and $L'$, respectively.

(a) Prove: If $\Theta_{L+u}(\tau, z) = \Theta_{L'+u'}(\tau, z)$ for all $\tau \in \mathbb{H}$, $z \in \mathbb{C}^n$, then $L + u = L' + u'$ as
sets.

(b) Prove that $L$ and $L'$ are isomorphic (Section 1.2.1) iff there exists an orthogonal map
$T \in \mathrm{O}_n(\mathbb{R})$ such that $\Theta_L(\tau, z) = \Theta_{L'}(\tau, Tz)$ for all $\tau \in \mathbb{H}$, $z \in \mathbb{C}^n$.


Question 2.3.3. Let $L$ be any integral lattice of dimension $n$. For each $m = 0, 1, 2, \ldots$, let
$L_{(m)}$ denote all the vectors $u \in L$ with norm-squared $u \cdot u = m$. Each automorphism $\alpha$ of
$L$ permutes the vectors in $L_{(m)}$, so for each $m$ we get a $\|L_{(m)}\|$-dimensional representation
$\rho_{(m)}$ of $\mathrm{Aut}(L)$ by permutation matrices. Thus, for each $\alpha \in \mathrm{Aut}(L)$, we can twist $\Theta_L$ as
follows: define


$\Theta_L^{(\alpha)}(\tau) := \sum_{m=0}^{\infty} \chi_{(m)}(\alpha) \exp[\pi i \tau m]$,


where $\chi_{(m)}$ is the character of the representation $\rho_{(m)}$. For example, $\Theta_L^{(\mathrm{id})} = \Theta_L$ and
$\Theta_L^{(-\mathrm{id})}(\tau) = 1$. Prove that, for each $\alpha \in \mathrm{Aut}(L)$, $\Theta_L^{(\alpha)}$ will be a modular form for some
$\Gamma(N)$ and some weight $0 \le k \le n/2$, and that $k = n/2$ iff $\alpha = \mathrm{id}$.


Question 2.3.4. Let $f$ be a modular form of weight $k$, for some $\Gamma(N)$.
(a) Prove that, for each choice of $r \in \mathbb{Q}$, the function $g(\tau) := f(\tau + r)$ is a modular
form of weight $k$, for some $\Gamma(M)$ ($M$ depending on $r$).


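(b) For any field $F$, prove that $\mathrm{SL}_2(F)$ is generated by the matrices $\begin{pmatrix} 1 & r \\ 0 & 1 \end{pmatrix}$ for $r \in F$,
together with $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. From this, prove that if $f$ is a modular form for $\Gamma(N)$ of weight
$k$, then for any $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Q})$, the function $h(\tau) := (c\tau + d)^{-k} f\!\left(\frac{a\tau + b}{c\tau + d}\right)$ will be a
modular form of weight $k$ for some $\Gamma(M)$ ($M$ depending on $a, b, c, d$).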
2.4 Representations and modular forms


According to I. M. Gel’fand, mathematics of any kind is representation theory.
This section applies this beautiful strategy to modular forms. 


There are at least formal similarities between quantum theory and modular forms. Wigner
taught that a particle should be identified with a unitary representation of $\mathrm{SL}_2(\mathbb{C})$ or
$\mathrm{SL}_2(\mathbb{R})$, in (3 + 1)- or (2 + 1)-dimensional space-time, respectively. In this section we
associate modular forms to unitary representations of $\mathrm{SL}_2(\mathbb{R})$, and the picture generalises
naturally to, for example, $\mathrm{SL}_2(\mathbb{C})$. Could there be some cross-fertilisation between the
methods and ideas of quantum field theory and modular forms?

In the 1962 International Congress of Mathematicians, I. M. Gel’fand remarked some- 
what cryptically that there is an intriguing analogy between the scattering matrix of quan- 
tum mechanics and zeta functions. Ten years later the idea was exploited and clarified 
by Faddeev and Pavlov, who applied the Lax—Phillips scattering theory to the theory of 
automorphic forms. For example, poles of the scattering matrix (which in quantum field 
theory would correspond to particles) correspond to zeros of the Riemann zeta function. 
Their work is generalised in [371], where we find for instance a new proof of the Selberg 
Trace Formula for SL2. These applications are significant, and hopefully a small hint of 
things to come. See also [562]. 


2.4.1 Automorphic forms 


Definitions 0.1 and 2.2.1 of modular functions and forms for SL2(Z) should seem very 
arbitrary. In mathematics we attack arbitrariness through generalisation. A good gener- 
alisation helps us to see the meaning of each feature, and puts the whole theory into a 
broader perspective. Of course we can generalise these definitions by replacing SL2(Z) 
with other Fuchsian groups $\Gamma < \mathrm{SL}_2(\mathbb{R})$, but this is too obvious to be helpful.

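Much more valuable is to understand the relation between $\mathbb{H}$ and $G = \mathrm{SL}_2(\mathbb{R})$. In
particular, an easy calculation shows that our action of $G$ on $\mathbb{H}$ is transitive. That is,
any point in $\mathbb{H}$ can get mapped to any other point in $\mathbb{H}$ by a matrix in $G$. In particular,

$\gamma_{x+iy} = \begin{pmatrix} \sqrt{y} & x/\sqrt{y} \\ 0 & 1/\sqrt{y} \end{pmatrix} \in G$

sends $i$ to $x + iy$. We call $\mathbb{H}$ a homogeneous space for $G$.
Moreover, the subgroup of $G$ fixing $i \in \mathbb{H}$, say, is $K = \mathrm{SO}_2(\mathbb{R})$. Thus


$\mathbb{H} = \mathrm{SL}_2(\mathbb{R})/\mathrm{SO}_2(\mathbb{R}) = G/K$. (2.4.1a)

More precisely, we have the Iwasawa decomposition

$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \frac{1}{\sqrt{y}} \begin{pmatrix} y & x \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$, where $x + iy = \frac{ai + b}{ci + d}$, $e^{i\theta} = \frac{d - ic}{|d - ic|}$. (2.4.1b)


In fact $\mathrm{SO}_2(\mathbb{R})$ is the unique (up to conjugation) maximal compact subgroup of $G$.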
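
The decomposition (2.4.1b) is easy to make concrete. The following sketch, with hypothetical function names `iwasawa` and `from_iwasawa`, extracts $(x, y, \theta)$ from a real matrix of determinant 1 and rebuilds the matrix from those coordinates, assuming the rotation convention of (2.4.1b).

```python
import math

def iwasawa(a, b, c, d):
    """Coordinates (x, y, theta) of [[a, b], [c, d]] in SL_2(R), following (2.4.1b)."""
    norm2 = c * c + d * d
    y = 1.0 / norm2                # Im((ai + b)/(ci + d)) when ad - bc = 1
    x = (a * c + b * d) / norm2    # Re((ai + b)/(ci + d))
    theta = math.atan2(-c, d)      # e^{i theta} = (d - ic)/|d - ic|
    return x, y, theta

def from_iwasawa(x, y, theta):
    """Rebuild (a, b, c, d) as y^(-1/2) [[y, x], [0, 1]] times the rotation by theta."""
    s = math.sqrt(y)
    co, si = math.cos(theta), math.sin(theta)
    return ((y * co - x * si) / s, (y * si + x * co) / s, -si / s, co / s)

if __name__ == "__main__":
    print(from_iwasawa(*iwasawa(2.0, 1.0, 3.0, 2.0)))  # recovers (2, 1, 3, 2) up to rounding
```

The pair $(x, y)$ records the point of $\mathbb{H} = G/K$, while $\theta$ records the element of $K = \mathrm{SO}_2(\mathbb{R})$.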


8 See the quotation on page 840 of Proc. ICM (American Mathematical Society, Providence 1987), edited by 
A. M. Gleason. 
In mathematics we try to find hidden structure, and that is the spirit in which (2.4.1a)
should be read. The key here was the transitive action: an expression like (2.4.1a) arises
whenever one has a homogeneous space. Note that the action $\gamma.\tau$ of $G$ on $\mathbb{H}$ now reduces
to matrix multiplication: $\gamma\gamma_\tau K$.

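Do modular forms respect (2.4.1a)? Can we lift modular forms $f : \mathbb{H} \to \mathbb{C}$ into functions
$\phi_f : G \to \mathbb{C}$? Yes, and in fact we gain something in the process. Use (2.4.1b):


$\phi_f \begin{pmatrix} a & b \\ c & d \end{pmatrix} := f\!\left(\frac{ai + b}{ci + d}\right) (ci + d)^{-k}$, (2.4.2a)


where $k$ is the weight of $f$. Then for any $A \in \mathrm{SL}_2(\mathbb{Z})$ and $\alpha \in \mathbb{R}$, we get


$\phi_f\!\left( A \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} \right) = \phi_f \begin{pmatrix} a & b \\ c & d \end{pmatrix} e^{ik\alpha}$. (2.4.2b)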

The point of multiplication by $(ci + d)^{-k}$ is now clear: it makes $\phi_f$ left-invariant with
respect to $\mathrm{SL}_2(\mathbb{Z}) =: \Gamma$. Thus we’ve sacrificed $K$-invariance and $\Gamma$-covariance, for
$K$-covariance and $\Gamma$-invariance. This is significant, because compact Lie groups like $K$ are
much easier to handle than infinite discrete groups like $\mathrm{SL}_2(\mathbb{Z})$.
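
The trade of covariance for invariance in (2.4.2b) can be tested numerically. The sketch below uses the weight-4 Eisenstein series, via its standard $q$-expansion $1 + 240\sum \sigma_3(n) q^n$, as a stand-in for $f$; this choice of $f$, the truncation length and all function names are assumptions made for the example.

```python
import cmath
import math

def eisenstein_e4(tau, terms=200):
    """Truncated q-expansion 1 + 240 * sum sigma_3(n) q^n of the weight-4 Eisenstein series."""
    q = cmath.exp(2j * cmath.pi * tau)
    total, qn = 1 + 0j, 1 + 0j
    for n in range(1, terms + 1):
        qn *= q  # running power q^n
        sigma3 = sum(d ** 3 for d in range(1, n + 1) if n % d == 0)
        total += 240 * sigma3 * qn
    return total

def matmul(g, h):
    (a, b), (c, d) = g
    (e, f), (u, v) = h
    return ((a * e + b * u, a * f + b * v), (c * e + d * u, c * f + d * v))

def rotation(alpha):
    return ((math.cos(alpha), math.sin(alpha)), (-math.sin(alpha), math.cos(alpha)))

def lift(g, f=eisenstein_e4, k=4):
    """phi_f(g) = f(g.i) (ci + d)^(-k), as in (2.4.2a)."""
    (a, b), (c, d) = g
    return f((a * 1j + b) / (c * 1j + d)) * (c * 1j + d) ** (-k)

def covariance_defect(A, g, alpha, k=4):
    """Relative difference between the two sides of (2.4.2b)."""
    lhs = lift(matmul(matmul(A, g), rotation(alpha)), k=k)
    rhs = cmath.exp(1j * k * alpha) * lift(g, k=k)
    return abs(lhs - rhs) / abs(rhs)

if __name__ == "__main__":
    g = ((1.2, 0.3), (0.4, (1 + 0.3 * 0.4) / 1.2))  # a sample matrix of determinant 1
    print(covariance_defect(((2, 1), (1, 1)), g, 0.7))  # tiny, at rounding level
```

Left multiplication by the integer matrix leaves the value unchanged, and the rotation on the right contributes only the phase $e^{ik\alpha}$.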

In particular, we find that the right multiplication in (2.4.2b) defines a one-dimensional
representation of $K$ on $\mathbb{C}\phi_f$. We know that the finite-dimensional irreducible
$K$-representations are parametrised by an integer, and all are one-dimensional.
Thus we get an algebraic interpretation for the parameter $k$ in Definition 2.2.1: it is the
highest weight of a representation of the maximal compact subgroup $\mathrm{SO}_2(\mathbb{R})$ of $\mathrm{SL}_2(\mathbb{R})$.

We also get a representation of $\mathrm{SL}_2(\mathbb{R})$ on the left side, given by $\phi_f \mapsto \phi_f \circ \gamma$.
The vector space here is the infinite-dimensional function space given by the $\mathbb{C}$-span of
the $\mathrm{SL}_2(\mathbb{R})$-orbit of $\phi_f$. The result is an irreducible representation of $\mathrm{SL}_2(\mathbb{R})$, which is
constant on $\Gamma = \mathrm{SL}_2(\mathbb{Z})$. This representation is unitary; in fact it is a subrepresentation
of the regular representation of $G$ on the Hilbert space $L^2(\Gamma\backslash G)$.

As an aside, note that everything generalises very naturally to Siegel modular
forms. There, $G$ is $\mathrm{Sp}_{2n}(\mathbb{R})$, $\Gamma$ is $\mathrm{Sp}_{2n}(\mathbb{Z})$ or a similar discrete group like $\Gamma_\theta^{(n)}$, and
$K = \mathrm{SO}_{2n}(\mathbb{R}) \cap \mathrm{Sp}_{2n}(\mathbb{R}) \cong \mathrm{U}_n(\mathbb{C})$. Once again, $\mathbb{H}_n = G/K$. For Jacobi forms, $G$ is a
semi-direct product of $\mathrm{SL}_2(\mathbb{R})$ with the Heisenberg group (it is constructed next
subsection), and $K$ is $\mathrm{SO}_2(\mathbb{R}) \times S^1$: once again $G/K = \mathbb{H} \times \mathbb{C}$, as it should. The weight $k$ and
index $m$ in Definition 2.3.2 parametrise the irreducible one-dimensional representations
of $\mathrm{SO}_2(\mathbb{R})$ and $S^1$, that is to say $K$. Thus the index of a Jacobi form has a natural algebraic
interpretation, as it should.

So the generalisation of modular forms and functions is starting to be clearer. We are
looking for functions on the space $\Gamma\backslash G$, for discrete subgroups $\Gamma$ of real Lie groups
$G$, and we should study them via the representation of $G$ they generate. The relation
between modular forms and representation theory was established in the 1950s by
Gel’fand and Fomin. Let’s make it more precise.

The unitary irreducible representations of $G = \mathrm{SL}_2(\mathbb{R})$ were classified by Bargmann
[44]. His motivation was physics (the Lorentz group). Of course there is the
one-dimensional identity representation. The remaining irreducible unitary representations
are all infinite-dimensional, and fall into three series: the principal series $P_s^{\pm}$ for $s \in \mathbb{R}$,
the complementary series $C_s$ for $0 < s < 1$, and the discrete series $D_n^{\pm}$ for $n = 2, 3, \ldots$
In addition, $G$ has many irreducible non-unitary representations. See, for example,
chapter 1.3 of [243] for explicit realisations of all the unitary representations. For example,
the discrete series $D_n^{+}$ consists of holomorphic functions $f$ on $\mathbb{H}$, with Petersson Hermitian
form $\langle f, g \rangle = \int_{\mathbb{H}} f(\tau)\overline{g(\tau)}\, y^{n-2}\, dx\, dy$, and action $f \mapsto (-c\tau + a)^{-n} f\!\left(\frac{d\tau - b}{-c\tau + a}\right)$.
Obviously our $G$-representation associated with $\phi_f$ is isomorphic to $D_k^{+}$. What $f$’s come
from the other $G$-representations?

Associated with the principal series are functions such as this analogue of the Eisenstein
series, called a Maass form:


$E(\tau, s) = \frac{1}{2} \sum_{\gcd(c,d)=1} \frac{y^s}{|c\tau + d|^{2s}}$, $s \in \mathbb{C}$.

This may look less strange when one considers the formula $\mathrm{Im}(\gamma.\tau) = y/|c\tau + d|^2$.
For fixed $\tau \in \mathbb{H}$, the Maass form is absolutely convergent for $\mathrm{Re}(s) > 1$ and has a
meromorphic extension to all $s \in \mathbb{C}$. For fixed $s \in \mathbb{C}$, it is invariant under $\mathrm{SL}_2(\mathbb{Z})$. It is
not a holomorphic function of $\tau$, and so cannot be a modular form in the usual sense, but
holomorphicity in Definitions 0.1 and 2.2.1 is a feature we must be prepared to lose, since
most real Lie groups $G$ aren’t complex manifolds. In fact we lost the holomorphicity of
$f$ when we wrote (2.4.2a). What takes its place?

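What is holomorphicity, other than the solution to differential equations (the
Cauchy–Riemann equations, or the Laplacian $\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ on $\mathbb{R}^2$)? The Maass forms aren’t
holomorphic, but they are eigenfunctions of the Laplacian on $\mathbb{H}$, namely $-y^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)$. By
the Laplacian on $\mathbb{H}$ we mean a second-order differential operator that is invariant under
all isometries $\mathrm{SL}_2(\mathbb{R})$.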
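
The eigenfunction property can be checked in two lines. The computation below takes the Laplacian on $\mathbb{H}$ in the form $-y^2(\partial_x^2 + \partial_y^2)$ and treats the Maass form as a sum of terms $\mathrm{Im}(\gamma.\tau)^s$; the eigenvalue $s(1-s)$ depends on that sign convention, which is an assumption here.

```latex
\begin{align*}
\Delta &:= -y^2\left(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}\right), &
\Delta\, y^s &= -y^2\cdot s(s-1)\,y^{s-2} = s(1-s)\,y^s .
\end{align*}
% Delta commutes with every gamma in SL_2(R), and Im(gamma.tau) = y/|c tau + d|^2, so
\begin{align*}
\Delta\!\left(\frac{y^s}{|c\tau+d|^{2s}}\right) &= s(1-s)\,\frac{y^s}{|c\tau+d|^{2s}},
&
\Delta\,E(\tau,s) &= s(1-s)\,E(\tau,s)\qquad(\mathrm{Re}\,s>1).
\end{align*}
```

Termwise differentiation is justified in the region of absolute convergence, and the identity then persists under meromorphic continuation in $s$.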

We are thus led to the role of differential operators. These can be understood as follows.
Whenever we have a Lie group representation, we also get an associated action of the
Lie algebra (the derived module of Section 1.5.5). The Lie algebra will typically act as
first-order differential operators; on $L^2(G)$ it acts by Lie derivatives. More precisely, to
$X \in \mathfrak{sl}_2(\mathbb{R})$ we get the action $f(g) \mapsto \frac{d}{dt} f(g e^{tX})\big|_{t=0}$. For example, $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \in \mathfrak{sl}_2(\mathbb{R})$
corresponds to $\frac{\partial}{\partial x}$, using the parametrisation of (2.4.1a). An action of $\mathfrak{sl}_2(\mathbb{R})$ implies an
action of the universal enveloping algebra $U(\mathfrak{sl}_2(\mathbb{R}))$, in our case simply by composing
differential operators to get ones of higher order. As always, the centre $Z(U(\mathfrak{sl}_2(\mathbb{R})))$
naturally plays a fundamental role. Here, it is generated by the second-order operator


$y^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right) - y\,\frac{\partial^2}{\partial x\, \partial\theta}$.

This is how the Laplacian arises, algebraically. By definition, it commutes with all
operators, so studying its eigenspaces helps decompose $L^2(\Gamma\backslash G)$; we used a similar idea
in decomposing Lie algebra modules into weight-spaces. Understanding that decomposition
is essentially equivalent to understanding the space of modular forms for $\Gamma$, and
can be called the harmonic analysis of automorphic forms.
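
To see a central second-order operator at work, one can apply it to the lift $\phi_f$ of a weight-$k$ modular form. The sketch below assumes the coordinates $(x, y, \theta)$ of (2.4.1b), in which $(ci + d)^{-k} = y^{k/2} e^{ik\theta}$, and assumes the operator is normalised as $y^2(\partial_x^2 + \partial_y^2) - y\,\partial_x\partial_\theta$; this normalisation is a choice made for the example.

```latex
\begin{align*}
\phi_f(x,y,\theta) &= f(x+iy)\,y^{k/2}e^{ik\theta}, \qquad
\mathcal{C} := y^2\left(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}\right) - y\,\frac{\partial^2}{\partial x\,\partial\theta}, \\
% f holomorphic: f_x = f', f_y = i f', hence f_{xx} + f_{yy} = 0
\left(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}\right)\!\big(f\,y^{k/2}\big)
 &= ik\,f'\,y^{k/2-1} + \tfrac{k}{2}\left(\tfrac{k}{2}-1\right) f\,y^{k/2-2}, \\
y\,\frac{\partial^2\phi_f}{\partial x\,\partial\theta} &= ik\,f'\,y^{k/2+1}e^{ik\theta}, \\
\mathcal{C}\,\phi_f &= \tfrac{k}{2}\left(\tfrac{k}{2}-1\right)\phi_f .
\end{align*}
```

The terms involving $f'$ cancel, so holomorphicity of $f$ is exactly what turns $\phi_f$ into an eigenfunction, with an eigenvalue depending only on the weight $k$.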
We have only scratched the surface, but this discussion and the following definition
should give the reader a glimpse of the resulting theory.


Definition 2.4.1 Let $\Gamma$ be a discrete subgroup of a real semi-simple Lie group $G$, and
let $K$ be a maximal compact subgroup of $G$. Let $\chi$ be a one-dimensional representation
of $K$. We call a smooth function $f : G \to \mathbb{C}$ an automorphic form for $\Gamma$ if:

(i) $f(\gamma g k) = \chi(k) f(g)$ for all $\gamma \in \Gamma$, $g \in G$, $k \in K$;

(ii) $f$ is an eigenfunction of every operator in $Z(U(\mathfrak{g}))$;

(iii) $f$ obeys a certain growth condition.


The term ‘automorphic form’ (going back to Klein in 1890) is much older than this
definition. Here, $\mathfrak{g}$ is the Lie algebra of $G$ and $Z(U(\mathfrak{g}))$ is the centre of its universal
enveloping algebra, which will be isomorphic to a polynomial algebra in $r$ variables,
where $r$ is the rank of $\mathfrak{g}$. As mentioned above, the differential equations in (ii) take
the place of holomorphicity. The growth condition is too technical to give here, but
for $\mathrm{SL}_2(\mathbb{Z})$ it reduces to holomorphicity at the cusps. For more on the relation between
automorphic forms and representations, see, for example, [89].

All modern material on automorphic functions uses the language of adèles and idèles,⁹
which unify and simplify the theory (at the expense of making it more abstract). However,
since they have no role in the remaining material of this book, we only sketch
their motivation, and remain true here to the spirit of this not-completely-self-contained
subsection.

Projective or inverse limits are the way algebra ‘integrates’ an infinite tower of structures
into a single structure. A classic, and relevant, example is divisibility by powers
of primes. We say that a given integer $n$ is divisible by $p^a$ if the canonical projection
$\mathbb{Z} \to \mathbb{Z}_{p^a}$ (‘reduce mod $p^a$’) sends $n$ to 0. Now, the rings $\mathbb{Z}_{p^a}$ and $\mathbb{Z}_{p^b}$ are related by a
homomorphism $\mathbb{Z}_{p^a} \to \mathbb{Z}_{p^b}$, provided $a > b$. So we get a tower


$\cdots \to \mathbb{Z}/p^3\mathbb{Z} \to \mathbb{Z}/p^2\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z} \to 0$.


The corresponding integrated structure is the projective limit $\varprojlim \mathbb{Z}_{p^a} =: \widehat{\mathbb{Z}}_p$, the $p$-adic
integers, which can be realised as formal power series $\sum_{i=0}^{\infty} a_i p^i$, $a_i \in \mathbb{Z}/p\mathbb{Z}$. Doing
arithmetic on them amounts to treating all $\mathbb{Z}/p^a\mathbb{Z}$ simultaneously; in this sense it is the
integration of all $\mathbb{Z}_{p^a}$. For example,


$\sqrt{2} = 3 + 1 \cdot 7 + 2 \cdot 7^2 + 6 \cdot 7^3 + \cdots$


in $\widehat{\mathbb{Z}}_7$. The $p$-adic rationals $\widehat{\mathbb{Q}}_p$ are the field of fractions of $\widehat{\mathbb{Z}}_p$, or equivalently the formal
Laurent series $\sum_{i=-N}^{\infty} a_i p^i$, $a_i \in \mathbb{Z}/p\mathbb{Z}$. They are to the ordinary rationals much as
$\mathbb{R} =: \widehat{\mathbb{Q}}_\infty$ is: a completion, on which calculus can be defined. For a readable introduction
to the $p$-adics, see [257]. Projective limits play a huge role in Section 6.3.3.
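
The expansion of $\sqrt{2}$ in the 7-adic integers can be reproduced by lifting a square root through the tower $\mathbb{Z}/7\mathbb{Z} \leftarrow \mathbb{Z}/7^2\mathbb{Z} \leftarrow \cdots$ one level at a time. The sketch below uses a Newton (Hensel) step for this; the function name `padic_sqrt_digits` is hypothetical, and the code assumes $p$ is an odd prime and $a$ is a nonzero square mod $p$.

```python
def padic_sqrt_digits(a, p, num_digits):
    """First `num_digits` base-p digits of a square root of `a` in the p-adic integers.

    Assumes p is an odd prime and a is a nonzero square mod p. Returns the root
    whose leading digit is the smallest square root of a mod p.
    """
    root = next(r for r in range(1, p) if (r * r - a) % p == 0)
    modulus = p
    for _ in range(num_digits - 1):
        modulus *= p
        # Newton/Hensel step: a root mod p^k is refined to a root mod p^(k+1)
        inverse = pow(2 * root, -1, modulus)  # modular inverse (Python 3.8+)
        root = (root - (root * root - a) * inverse) % modulus
    digits = []
    for _ in range(num_digits):
        digits.append(root % p)
        root //= p
    return digits

if __name__ == "__main__":
    print(padic_sqrt_digits(2, 7, 4))  # -> [3, 1, 2, 6]
```

Each pass through the loop works in one more ring of the tower, which is the sense in which the projective limit treats all $\mathbb{Z}/p^a\mathbb{Z}$ simultaneously.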

The more intuitive notion of limit, namely the injective or direct limit, arises when all
arrows are reversed (i.e. when we have a sequence of embeddings rather than projections),
and is the algebraic analogue of taking derivatives. The prototypical example is the
space of smooth functions $F_M(U)$ on an open patch of a manifold $M$: the direct limit
$\varinjlim F_M(U)$, as $U \to \{p\}$, is isomorphic to the space of germs at $p$.


9 Idèles were introduced by Chevalley in 1935 to remove some of the analysis being used with L-functions,
etc. The word comes from ‘ideal’. Adèles were introduced in 1945 as an additive version of idèles.

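The modern theory of automorphic forms collects together the $\widehat{\mathbb{Q}}_p$ into the additive
group of adèles $\mathbb{A}$ and multiplicative group of idèles $\mathbb{A}^\times$. The adèles are defined to
be the group of all sequences $(x_\infty, x_2, x_3, x_5, \ldots, x_p, \ldots)$, where $x_\infty \in \mathbb{R}$, $x_p \in \widehat{\mathbb{Q}}_p$,
and for all but finitely many $p$, $x_p \in \widehat{\mathbb{Z}}_p$. The idèles are defined similarly, and we
obtain


$\mathbb{A}^\times = \mathbb{Q}^\times \times \mathbb{R}^\times_{>0} \times \prod_p \widehat{\mathbb{Z}}_p^\times$,


where $\sum_{i \ge 0} a_i p^i \in \widehat{\mathbb{Z}}_p^\times$ iff $a_0 \ne 0$. The rationals $\mathbb{Q}$ embed in each $\widehat{\mathbb{Q}}_p$, and so embed
diagonally in $\mathbb{A}$ ($r \in \widehat{\mathbb{Z}}_p$ for any prime $p$ not dividing the denominator of $r$). There
are many generalisations of $\mathbb{A}$ and $\mathbb{A}^\times$, for example we can replace $\mathbb{Q}$ by other number
fields. But what good are they? What have they to do with modular forms?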

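There are many situations where the level of a modular form is variable. For example,
any $A \in \mathrm{SL}_2(\mathbb{Q})$ takes a modular form for $\Gamma(N)$ to one for some other $\Gamma(N')$ (see
Question 2.3.4). We have natural maps from the surface $\Gamma(n)\backslash\mathbb{H}$ to any $\Gamma(d)\backslash\mathbb{H}$, when
$d$ divides $n$. Collecting together this tower of surfaces $\Gamma(n)\backslash\mathbb{H}$ into a single structure
amounts to taking the limit space $\widehat{\mathbb{H}} := \varprojlim \Gamma(n)\backslash\mathbb{H}$. Functions on $\widehat{\mathbb{H}}$ include ratios $f/g$
of modular forms of the same weight but different levels. Much as


$\varprojlim \mathbb{R}/n\mathbb{Z} = \mathbb{A}/\mathbb{Q}$


as topological groups, we get


$\widehat{\mathbb{H}} = \mathrm{SL}_2(\mathbb{Q})\backslash \mathrm{SL}_2(\mathbb{A})/K_\infty$, (2.4.3)


where $K_\infty$ consists of all sequences of matrices $(A, I_2, I_3, \ldots)$ where $A \in \mathrm{SO}_2(\mathbb{R}) \subset
\mathrm{SL}_2(\mathbb{R})$ and the $I_p$’s are the identity matrices in each $\mathrm{SL}_2(\widehat{\mathbb{Q}}_p)$. In Section 4.3.3 we
discover $\widehat{\mathbb{H}}$ naturally in nonperturbative string theory.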

Similarly, a Dirichlet character (see Section 2.3.3) can be thought of as a continuous
one-dimensional representation on $\mathbb{Q}^\times\backslash\mathbb{A}^\times$, and the Galois group of a finite abelian
extension of $\mathbb{Q}$ can be thought of as a quotient of $\mathbb{Q}^\times\backslash\mathbb{A}^\times$.

The Langlands conjectures suggest that the $n$-dimensional representations of the absolute
Galois group $\mathrm{Gal}(\overline{K}/K)$ of a field $K$ (such as $\mathbb{Q}$) correspond to ‘automorphic
representations’ of $\mathrm{GL}_n(\mathbb{A})$, where $\mathbb{A}$ here is the group of adèles of $K$. This correspondence
can be seen through the corresponding L-functions. For $\mathrm{GL}_1$ and $K = \mathbb{Q}$, this
correspondence involves the Kronecker–Weber Theorem and Dirichlet characters. For $\mathrm{GL}_2$ this
relates two-dimensional representations of Galois groups to modular forms. A recent
accessible introduction to the Langlands Programme is [90]. Although there are hints
of some sort of relation between the Langlands conjectures and Moonshine in its more
general sense, these are still too speculative to go into here. However, Section 6.3.3 may
whet one’s appetite.
2.4.2 Theta functions as matrix entries


The relationship between representation theory and modular forms discussed last section 
is quite democratic in the sense that it exists at the level of the vector space of modular 
forms. Democracy is all well and good, but we are not equally interested in all modular 
forms — some have names! 

The Jacobi theta function $\theta_3(\tau, z)$ is the unique quasi-periodic entire function, in the
sense that any entire function $f : \mathbb{C} \to \mathbb{C}$ obeying $f(z + 1) = f(z)$ and
$f(z + \tau) = a\, e^{-2\pi i z} f(z)$ for some constants $\tau \in \mathbb{H}$ and $a \in \mathbb{C}$ is a constant multiple of the
function


$f(z) = 1 + \sum_{n \ne 0} a^{-n}\, e^{\pi i (n^2 - n)\tau}\, e^{2\pi i n z}$.

For an elementary analytic proof see section 1.1 of [439]. From this uniqueness, all
properties of $\theta_3$ can be quickly derived. In this section we sketch a striking algebraic
version of this argument.

Starting in the 1960s, theta functions were interpreted as matrix entries in a represen- 
tation of the Heisenberg group. The motivation was pure Moonshine: 


À force d’habitude, le fait que les séries thêta définissent des fonctions modulaires
a presque cessé de nous étonner. Mais l’apparition du groupe symplectique comme
un deus ex machina dans les célèbres travaux de Siegel sur les formes quadratiques
n’a rien perdu encore de son caractère mystérieux. Le but de ce mémoire, et de ceux
qui lui feront suite, n’est pas, bien entendu, d’élucider définitivement la question,
mais de jeter un peu de lumière sur certains aspects de cette théorie qui étaient
restés dans l’ombre jusqu’à présent. [555a]¹⁰


The resulting explanation of the transformation $\theta_3(-1/\tau) = \sqrt{\tau/i}\; \theta_3(\tau)$ can be extended
to many other functions arising in Moonshine. First let us sketch the basic idea, before
giving details and generalisations.

The starting point is the thought of realising special functions as matrix entries of
Lie group representations. An elementary example of this involves the representation of
$S^1 = \mathrm{U}_1(\mathbb{R})$ as rotations in $\mathbb{R}^2$:


$\theta \mapsto \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$. (2.4.4)


The basic properties of $\sin(\theta)$ and $\cos(\theta)$ (e.g. angle-sum formulae, or even-oddness)
can quickly be derived from this. We want to do something similar with $\theta_3$.

10 ‘By force of habit, the fact that theta series define modular forms has nearly ceased to amaze us. But the
appearance of the symplectic group as a deus ex machina in the famous work of Siegel on quadratic forms
has still lost none of its mysterious character. The goal of this paper, and of those which follow it, is not of
course to clarify definitively the question, but rather to shed a little light on certain aspects of this theory
which have remained in the dark up to now.’

Begin by recalling the full variable dependence of $\theta_3(\tau, z, u)$, given in (2.3.4). For
fixed $u$ we get a Jacobi form, and for fixed $\tau$ and $u$ we get an elliptic function for the
torus $\mathbb{C}/(\mathbb{Z} + \mathbb{Z}\tau)$. This leads us to consider two translation operators on the space of
(say) entire functions $f : \mathbb{C} \to \mathbb{C}$, as follows. Fix $\tau \in \mathbb{H}$ and define


$(S_b f)(z) = f(z + b)$, (2.4.5a)
$(T_a f)(z) = \exp[\pi i a^2 \tau + 2\pi i a z]\, f(z + a\tau)$, (2.4.5b)


for any $a, b \in \mathbb{R}$. In this way, for each fixed $\tau \in \mathbb{H}$, $\mathbb{R}^2$ acts on the space of entire
functions, the role of $\tau$ being primarily to parametrise different isomorphisms between
the additive groups $\mathbb{R}^2$ and $\mathbb{C}$. However, an easy calculation shows that $T_a$ and $S_b$ don’t
commute, rather $S_b \circ T_a = \exp[2\pi i ab]\, T_a \circ S_b$. So the group $\langle T_a, S_b \rangle$ generated by all
$T_a$’s and $S_b$’s is a central extension of $\mathbb{R}^2$ by $S^1$, consisting of all pairs $[\lambda, x]$ for
$\lambda \in \mathbb{C}$, $|\lambda| = 1$ and $x = (x_1, x_2) \in \mathbb{R}^2$, and operation


$[\lambda, x] \cdot [\mu, y] = [\lambda\mu \exp[2\pi i x_2 y_1],\ x + y]$.
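
The commutation rule for the two translation operators, and the fact that $\theta_3$ is fixed by $S_1$ and $T_1$, can both be checked numerically. The sketch below models the operators of (2.4.5) as higher-order functions; the sample value `TAU`, the truncation `cutoff` and the helper names are assumptions made for the example.

```python
import cmath

TAU = 0.3 + 1.2j  # sample point of the upper half-plane (an arbitrary choice)

def S(b):
    """Translation operator (2.4.5a): (S_b f)(z) = f(z + b)."""
    return lambda f: (lambda z: f(z + b))

def T(a, tau=TAU):
    """Operator (2.4.5b): (T_a f)(z) = exp[pi i a^2 tau + 2 pi i a z] f(z + a tau)."""
    def apply(f):
        return lambda z: cmath.exp(1j * cmath.pi * (a * a * tau + 2 * a * z)) * f(z + a * tau)
    return apply

def theta3(z, tau=TAU, cutoff=12):
    """Truncated Jacobi theta series: sum of exp[pi i n^2 tau + 2 pi i n z]."""
    return sum(cmath.exp(1j * cmath.pi * (n * n * tau + 2 * n * z))
               for n in range(-cutoff, cutoff + 1))

def commutation_defect(a, b, f, z):
    """|S_b T_a f - exp(2 pi i a b) T_a S_b f| evaluated at the point z."""
    lhs = S(b)(T(a)(f))(z)
    rhs = cmath.exp(2j * cmath.pi * a * b) * T(a)(S(b)(f))(z)
    return abs(lhs - rhs)

if __name__ == "__main__":
    entire = lambda z: cmath.exp(0.5 * z) + z * z
    print(commutation_defect(0.37, -1.1, entire, 0.2 + 0.1j))
    print(abs(T(1)(theta3)(0.2 + 0.1j) - theta3(0.2 + 0.1j)))
```

Both printed numbers should be at rounding level, reflecting the phase $\exp[2\pi i ab]$ in the group law and the invariance of $\theta_3$ under the integer translations.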


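This group is called the Heisenberg group $H$. Then (2.4.5) says that $\theta_3$ is a vector in a
space carrying a representation of $H$. Now, it turns out that all irreducible representations
$(\pi, \mathcal{H})$ of $H$ are essentially isomorphic. A more natural and useful way to see $\theta_3$ in any
such representation $(\pi, \mathcal{H})$ is by defining a vector $f_\tau \in \mathcal{H}$ and distribution $\mu_{\mathbb{Z}}$ such that
the Hermitian product


$\langle \pi_{[1,x]} f_\tau, \mu_{\mathbb{Z}} \rangle = c\, e^{\pi i x_1 (x_1 \tau + x_2)}\, \theta_3(\tau, x_1 \tau + x_2)$ (2.4.6)


for some nonzero constant $c$. The exponential factor on the right side of (2.4.6) simplifies
the quasi-periodicity of the right side.

We will see that $\mathrm{SL}_2(\mathbb{R})$ acts as automorphisms on the Heisenberg group $H$. Hence
for any $\gamma \in \mathrm{SL}_2(\mathbb{R})$, we get a new representation $\pi^\gamma$ of $H$ by $[\lambda, x] \mapsto \pi_{[\lambda, x\gamma]}$. This
representation must be isomorphic to $\pi$, so there is a (unitary) operator $R_\gamma$ on $\mathcal{H}$ such that
$\pi^\gamma_{[\lambda,x]} = R_\gamma \circ \pi_{[\lambda,x]} \circ R_\gamma^{-1}$. The assignment $\gamma \mapsto R_\gamma$ defines a projective representation
of $\mathrm{SL}_2(\mathbb{R})$ on $\mathcal{H}$. Modularity of $\theta_3$ now follows from the calculation


$\langle \pi_{[1,x]} f_\tau, \mu_{\mathbb{Z}} \rangle = \langle R_\gamma \pi_{[1,x]} f_\tau, R_\gamma \mu_{\mathbb{Z}} \rangle = \langle \pi_{[1,x\gamma]} R_\gamma f_\tau, R_\gamma \mu_{\mathbb{Z}} \rangle$, (2.4.7)


together with the computation of $R_\gamma f_\tau$ and $R_\gamma \mu_{\mathbb{Z}}$ for the $\gamma \in \Gamma_\theta < \mathrm{SL}_2(\mathbb{Z})$. Let us now
fill in the details.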

For reasons that will be clear shortly, it is preferable to work instead of $[\lambda, x]$ with the
realisation of the group $H$ given by all pairs $(\lambda, x)$ with operation


$(\lambda, x) \cdot (\mu, y) = (\lambda\mu \exp[\pi i (x_1 y_2 - x_2 y_1)],\ x + y)$.

The isomorphism between these realisations of $H$ is given by the correspondence

$(\lambda, x) \leftrightarrow [\lambda^{-1} \exp[\pi i x_1 x_2],\ x]$.
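
That the correspondence between the two realisations respects the group laws can be confirmed on sample elements. The sketch below assumes the correspondence sends $(\lambda, x)$ to $[\lambda^{-1} \exp[\pi i x_1 x_2], x]$; the function names are hypothetical, and elements are modelled as pairs of a unit complex number and a tuple $(x_1, x_2)$.

```python
import cmath

def mul_square(p, q):
    """Group law [lam, x].[mu, y] = [lam mu exp(2 pi i x_2 y_1), x + y]."""
    (lam, x), (mu, y) = p, q
    return (lam * mu * cmath.exp(2j * cmath.pi * x[1] * y[0]),
            (x[0] + y[0], x[1] + y[1]))

def mul_round(p, q):
    """Group law (lam, x).(mu, y) = (lam mu exp(pi i (x_1 y_2 - x_2 y_1)), x + y)."""
    (lam, x), (mu, y) = p, q
    return (lam * mu * cmath.exp(1j * cmath.pi * (x[0] * y[1] - x[1] * y[0])),
            (x[0] + y[0], x[1] + y[1]))

def round_to_square(p):
    """Assumed correspondence (lam, x) -> [lam^(-1) exp(pi i x_1 x_2), x]."""
    lam, x = p
    return (cmath.exp(1j * cmath.pi * x[0] * x[1]) / lam, x)

if __name__ == "__main__":
    p = (cmath.exp(0.3j), (0.4, -1.3))
    q = (cmath.exp(-1.1j), (2.2, 0.7))
    lhs = round_to_square(mul_round(p, q))
    rhs = mul_square(round_to_square(p), round_to_square(q))
    print(abs(lhs[0] - rhs[0]))  # at rounding level: the map is a homomorphism
```

The inversion of $\lambda$ is what makes the phases match, and it is consistent with the operators of (2.4.5) giving a representation in which the centre acts by $\lambda^{-1}$.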


This group $H$ is a three-dimensional real Lie group corresponding to the Heisenberg Lie
algebra $\mathfrak{Heis}$ defined in (1.4.3). It is a quotient by $\mathbb{Z} \cong \left\langle \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \right\rangle$ of the group $\widetilde{H}$
of upper-triangular matrices


$\begin{pmatrix} 1 & a & c \\ 0 & 1 & b \\ 0 & 0 & 1 \end{pmatrix} \in \mathrm{SL}_3(\mathbb{R})$.


$\widetilde{H}$ is the (unique) simply-connected Lie group with Lie algebra $\mathfrak{Heis}$; it isn’t important
that we’re focusing on $H$ rather than its universal cover $\widetilde{H}$. The group $H$ and its
$(2n + 1)$-dimensional versions (the obvious extension of $\mathbb{R}^{2n}$ by $S^1$) were studied originally in the
context of quantum mechanics, hence their name.

The representation theory of these groups was established around 1930. Let $\pi$ be a
unitary irreducible representation of $H$, in a Hilbert space $\mathcal{H}$. Recall from Section 1.5.5
that this means $\pi$ is a homomorphism from $H$ into the group of unitary operators of
$\mathcal{H}$; moreover, for each $f \in \mathcal{H}$, the map from $H$ to $\mathcal{H}$ given by $(\lambda, x) \mapsto \pi_{(\lambda,x)} f$ is
continuous. First note that by Schur’s Lemma (the analogue here of Lemma 1.1.3), the
central element $(\lambda, 0) \in H$ will act in $\mathcal{H}$ by a scalar multiple $\lambda^n$ for some $n \in \mathbb{Z}$.


Theorem 2.4.2 (Stone–von Neumann) Let $\pi$ be a unitary irreducible representation
of $H$, obeying $\pi_{(\lambda,0)}(f) = \lambda^n f$.

(i) If $n \ne 0$, then $\pi$ is infinite-dimensional and any other unitary irreducible
representation $\pi'$ of $H$ obeying $\pi'_{(\lambda,0)}(f) = \lambda^n f$ will be unitarily equivalent to $\pi$.

(ii) If $n = 0$, then $\pi$ is one-dimensional and unitarily equivalent to $(\lambda, x) \mapsto e^{i a \cdot x} \in \mathbb{C}$
for some vector $a \in \mathbb{R}^2$.


We’re interested in the case $n = 1$; see, for example, theorem 1.2 in [440] for a proof
of this special case. There are many different realisations for this unique irreducible
representation. The simplest (sometimes called the Schrödinger representation) uses the
Hilbert space $\mathcal{H} = L^2(\mathbb{R})$. The action of $(\lambda, x) \in H$ on $f \in L^2(\mathbb{R})$ is given by the unitary
operator $U_{(\lambda,x)}$ defined by

$(U_{(\lambda,x)} f)(y) = \lambda \exp[\pi i (2yx_2 + x_1x_2)]\, f(y + x_1).$

This is (essentially) the exponential of the defining representation (4.2.5) of $\mathfrak{Heis}$.
Incidentally, the action of $S_b$, $T_a$ in (2.4.5) on entire functions extends to an $n = -1$
representation of $H$; this representation is anti-linearly equivalent to the Schrödinger
representation.
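The claim that the Schrödinger formula really is a representation with central parameter $n = 1$ can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the group law $(\lambda, x)\cdot(\mu, y) = (\lambda\mu \exp[\pi i(x_1y_2 - x_2y_1)], x + y)$ and the operator $(U_{(\lambda,x)} f)(y) = \lambda \exp[\pi i(2yx_2 + x_1x_2)] f(y + x_1)$, and the names `heis_mul`, `schrodinger`, `gaussian` and `law_defect` are hypothetical helpers introduced only for illustration.

```python
import cmath
import math

def heis_mul(a, b):
    """Multiply (lam, (x1, x2)) and (mu, (y1, y2)) using the symmetric cocycle."""
    (lam, (x1, x2)), (mu, (y1, y2)) = a, b
    phase = cmath.exp(1j * math.pi * (x1 * y2 - x2 * y1))
    return (lam * mu * phase, (x1 + y1, x2 + y2))

def schrodinger(g, f):
    """Return U_g f, with (U_g f)(y) = lam * exp[pi i (2 y x2 + x1 x2)] * f(y + x1)."""
    lam, (x1, x2) = g
    return lambda y: lam * cmath.exp(1j * math.pi * (2 * y * x2 + x1 * x2)) * f(y + x1)

def gaussian(y):
    """A convenient test vector in L^2(R) (an arbitrary choice)."""
    return cmath.exp(-math.pi * y * y)

def law_defect(g, h, f=gaussian, points=(-1.3, 0.0, 0.4, 2.1)):
    """Largest sampled gap between U_g(U_h f) and U_{gh} f."""
    lhs = schrodinger(g, schrodinger(h, f))
    rhs = schrodinger(heis_mul(g, h), f)
    return max(abs(lhs(y) - rhs(y)) for y in points)

g = (cmath.exp(0.7j), (0.3, -1.1))
h = (cmath.exp(-0.2j), (1.4, 0.5))
print(law_defect(g, h), law_defect(h, g))  # both should be at rounding level
```

A sampled check like this is of course no proof, but it catches sign slips in the cocycle quickly.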

We want to recover the theta function naturally from the $n = 1$ representation. As
always, ‘natural’ means free of arbitrary choices, such as a specific realisation of the
$n = 1$ representation, or a specific basis of the underlying Hilbert space. Begin with any
realisation $(\pi, \mathcal{H})$ of the $n = 1$ representation of $H$.

As we see in Section 1.5.5, a unitary representation $U$ of a Lie group $G$ on a space $\mathcal{H}$
induces a representation $\delta U$ (the derived module) of the corresponding Lie algebra $\mathfrak{g}$ on
a dense subspace $\mathcal{H}_\infty$ of $\mathcal{H}$ by anti-Hermitian operators. For example, the representation
(2.4.4) of $\mathrm{U}_1(\mathbb{R})$ acts on the Hilbert space $\mathcal{H} = L^2(S^1) \oplus L^2(S^1)$ of all pairs
$\binom{f(\theta)}{g(\theta)}$. To see how the corresponding Lie algebra $\mathfrak{u}_1(\mathbb{R}) \cong \mathbb{R}$ acts,
decompose (2.4.4) into irreducibles (i.e. diagonalise):


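$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}^{-1} \begin{pmatrix} e^{-i\theta} & 0 \\ 0 & e^{i\theta} \end{pmatrix} \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}.$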


Thus the Lie algebra $\mathfrak{u}_1(\mathbb{R})$ acts as


x —1 - od d 
AS l i ixi pa l 1 z 2 “XG 
i 1 0 —X1G i 1 XG 0 


The domain of these operators isn’t the whole of the Hilbert space H, but it does contain 
the dense subspace consisting of the infinitely differentiable functions. 

Similarly, our representation $\pi$ of $H$ on $\mathcal{H}$ induces an anti-Hermitian representation
$\delta\pi$ of $\mathfrak{Heis}$ on a dense subspace $\mathcal{H}_\infty$ of $\mathcal{H}$. If we write $e^{x_1 A} = (1, (x_1, 0))$,
$e^{x_2 B} = (1, (0, x_2))$ and $e^{tC} = (e^{2\pi i t}, 0)$, then using the Baker–Campbell–Hausdorff formula
(1.4.6), these generators obey $[A, B] = C$, $[A, C] = [B, C] = 0$. As an example,
in the Schrödinger $n = 1$ representation on space $\mathcal{H} = L^2(\mathbb{R})$, these become the ‘momentum
operator’ $\delta U_A f = \frac{d}{dx} f$, the ‘position operator’ $(\delta U_B f)(x) = 2\pi i x f(x)$ and the central
term $\delta U_C f = 2\pi i f$. In this example, the dense subspace $\mathcal{H}_\infty$ is the Schwartz space
$\mathcal{S}(\mathbb{R})$ (Section 1.3.1) consisting of infinitely differentiable, rapidly decreasing functions.
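The commutation relations $[A, B] = C$, $[A, C] = [B, C] = 0$ for the operators $\frac{d}{dx}$, $2\pi i x$ and $2\pi i$ can be confirmed mechanically. The sketch below is an illustration under a simplifying assumption: it lets the operators act on polynomials, stored as coefficient lists, purely as a stand-in for smooth functions (polynomials are not in the Schwartz space, but the operator identity is the same). The names `op_A`, `op_B`, `op_C`, `sub`, `commutator` and `is_zero` are hypothetical.

```python
import math

TWO_PI_I = 2j * math.pi

def op_A(p):
    """'Momentum' d/dx on a coefficient list p, where p[k] multiplies x**k."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def op_B(p):
    """'Position': multiplication by 2*pi*i*x (shifts coefficients up by one degree)."""
    return [0] + [TWO_PI_I * c for c in p]

def op_C(p):
    """Central term: multiplication by the scalar 2*pi*i."""
    return [TWO_PI_I * c for c in p]

def sub(p, q):
    """Coefficientwise difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def commutator(X, Y, p):
    """Apply [X, Y] = XY - YX to the polynomial p."""
    return sub(X(Y(p)), Y(X(p)))

def is_zero(p, tol=1e-12):
    return all(abs(c) <= tol for c in p)

p = [1.0, -2.0, 0.5, 3.0]  # 1 - 2x + 0.5x^2 + 3x^3, an arbitrary test polynomial
print(is_zero(sub(commutator(op_A, op_B, p), op_C(p))))
```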

We are now ready to define the two vectors $f_\tau$, $u_{\mathbb{Z}}$ in (2.4.6). Consider the subspace
$W_\tau$ consisting of all $f \in \mathcal{H}$ for which $(\delta\pi_A - \tau\,\delta\pi_B) f$ is defined, and equals 0. This can
be thought of as a holomorphicity condition $\bar\partial f = 0$ (recall $\tau$ corresponds to $\sqrt{-1}$).
We know that $W_\tau$ will be one-dimensional for our choice of $\pi$, since it manifestly is for
the Schrödinger representation $U$: there, $W_\tau = \mathbb{C}\, e^{\pi i \tau x^2}$. Choose any nonzero $f_\tau \in W_\tau$.

The map $\sigma(n) := ((-1)^{n_1 n_2}, n)$ defines a homomorphism $\mathbb{Z}^2 \to H$, and obeys
$(p \circ \sigma)(n) = n$ for the obvious projection $p : H \to \mathbb{R}^2$; we say $p$ ‘splits over $\mathbb{Z}^2$’. Define
$V$ to be the common 1-eigenspace of all $U_{\sigma(n)}$. More precisely, let $V$ consist of all
(tempered) distributions $u \in \mathcal{H}_\infty^{*}$ with the property that, for all $n \in \mathbb{Z}^2$ and all $f \in \mathcal{H}_\infty$,
$\langle \pi_{\sigma(n)} f, u \rangle = \langle f, u \rangle$. For example, in the Schrödinger representation, we must have
$e^{2\pi i n_2 y}\, u(y + n_1) = u(y)$ for all $n \in \mathbb{Z}^2$. Note that $u_{\mathbb{Z}}(y) = \sum_{n \in \mathbb{Z}} \delta(y + n)$ satisfies that,
and using test functions $f(y) = e^{2\pi i m y}$ it quickly follows that this $u$ is unique up to scalar
multiplication. Therefore, for our representation $\pi$, $V$ will also be one-dimensional.
Choose any nonzero $u_{\mathbb{Z}} \in V$. It encodes quasi-periodicity.

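Thus we obtain, in the Schrödinger representation,

$\langle U_{(1,x)} f_\tau, u_{\mathbb{Z}} \rangle = \left\langle e^{\pi i (2yx_2 + x_1x_2)}\, e^{\pi i \tau (y + x_1)^2},\ \sum_{n \in \mathbb{Z}} \delta(y + n) \right\rangle$

which simplifies to the right side of (2.4.6) with $c = 1$. Therefore, by uniqueness of $\pi$
and basis independence of the Hermitian product $\langle\, ,\, \rangle$, we get that (2.4.6) holds regardless
of the realisation $(\pi, \mathcal{H})$ and vectors $f_\tau$, $u_{\mathbb{Z}}$ we choose.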
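The simplification just described can be tested numerically. The sketch below is hedged in several ways: it assumes $\theta_3(\tau, z) = \sum_n \exp[\pi i \tau n^2 + 2\pi i n z]$, takes $f_\tau(y) = e^{\pi i \tau y^2}$ and $u_{\mathbb{Z}} = \sum_n \delta(y + n)$, and treats the pairing with the delta comb as plain evaluation at $y = -n$ (so any conjugation convention in the Hermitian product is ignored). The names `theta3`, `pairing` and `closed_form` are hypothetical.

```python
import cmath
import math

def theta3(tau, z, terms=40):
    """Truncated theta_3(tau, z) = sum_n exp[pi i tau n^2 + 2 pi i n z]."""
    return sum(cmath.exp(1j * math.pi * (tau * n * n + 2 * n * z))
               for n in range(-terms, terms + 1))

def pairing(tau, x1, x2, terms=40):
    """Evaluate (U_{(1,x)} f_tau)(y) on the delta comb, i.e. sum over y = -n."""
    total = 0
    for n in range(-terms, terms + 1):
        y = -n
        total += (cmath.exp(1j * math.pi * (2 * y * x2 + x1 * x2))
                  * cmath.exp(1j * math.pi * tau * (y + x1) ** 2))
    return total

def closed_form(tau, x1, x2):
    """exp[pi i x1 (tau x1 + x2)] * theta_3(tau, x1 tau + x2), i.e. c = 1."""
    return cmath.exp(1j * math.pi * x1 * (tau * x1 + x2)) * theta3(tau, x1 * tau + x2)

tau, x1, x2 = 0.3 + 0.8j, 0.4, -0.7
print(abs(pairing(tau, x1, x2) - closed_form(tau, x1, x2)))
```

The two sums agree term by term after relabelling $n \mapsto -n$, which is why the difference is at rounding level rather than merely small.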

The reader can verify that quasi-periodicity is automatic (Question 2.4.3). The modularity
is of course more difficult (and more interesting). To do this, we need to describe
the action of $\mathrm{SL}_2(\mathbb{R})$ on the space $\mathcal{H}$ (which we can take to be $L^2(\mathbb{R})$).
Any $\gamma \in \mathrm{SL}_2(\mathbb{R})$ defines an automorphism of $H$ by $(\lambda, x) \mapsto (\lambda, \gamma.x)$, by

$\gamma.x = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} dx_1 - cx_2 \\ -bx_1 + ax_2 \end{pmatrix}.$ (2.4.8)

The precise form of this action is chosen so that (2.4.10a) below will involve the usual
Möbius action of $\mathrm{SL}_2(\mathbb{R})$ on $\mathbb{H}$. We can twist by $\gamma$ and thus get a new representation
$(\pi', \mathcal{H})$ of $H$, defined by $\pi'_{(\lambda,x)} f = \pi_{(\lambda,\gamma.x)} f$. Obviously $\pi'$ is also irreducible and has
central parameter $n = 1$, so by the Stone–von Neumann Theorem must be unitarily
equivalent to $\pi$. That is, there exists a unitary operator $R_\gamma$ on the Hilbert space $\mathcal{H}$ that
intertwines $\pi$ and $\pi'$: $R_\gamma \pi = \pi' R_\gamma$. The assignment $\gamma \mapsto R_\gamma$ is only defined up to a
constant, and so we get a projective representation of $\mathrm{SL}_2(\mathbb{R})$ on $\mathcal{H}$. As we learn in
Section 3.1.1, projective representations become true representations when we centrally
extend. In particular, we get a true representation when we replace $\mathrm{SL}_2(\mathbb{R})$ with a double-cover
called the metaplectic group $\mathrm{Mp}_2(\mathbb{R})$.
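That $(\lambda, x) \mapsto (\lambda, \gamma.x)$ is an automorphism of $H$ comes down to two facts: $\gamma.x$ is linear in $x$ and preserves the symplectic form $x_1y_2 - x_2y_1$ appearing in the group law. A small exact-arithmetic check follows; it is a sketch which assumes the formula $\gamma.x = (dx_1 - cx_2,\ -bx_1 + ax_2)$, and the helper names `twist`, `symplectic` and `matmul` are hypothetical.

```python
def matmul(g, h):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = g
    (e, f), (p, q) = h
    return ((a * e + b * p, a * f + b * q), (c * e + d * p, c * f + d * q))

def twist(gamma, x):
    """gamma.x = (d x1 - c x2, -b x1 + a x2)."""
    (a, b), (c, d) = gamma
    x1, x2 = x
    return (d * x1 - c * x2, -b * x1 + a * x2)

def symplectic(x, y):
    """The form x1 y2 - x2 y1 that governs the Heisenberg cocycle."""
    return x[0] * y[1] - x[1] * y[0]

gamma = ((2, 1), (3, 2))    # determinant 1
delta = ((1, -1), (1, 0))   # determinant 1
x, y = (5, -7), (2, 3)

# Invariance of the form, and compatibility with matrix multiplication.
print(symplectic(twist(gamma, x), twist(gamma, y)) == symplectic(x, y))
print(twist(matmul(gamma, delta), x) == twist(gamma, twist(delta, x)))
```

Because the twist is conjugation of $\gamma$ by a fixed matrix, it composes as a left action, which the second line checks.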

The metaplectic group is the unique connected double-cover of $\mathrm{SL}_2(\mathbb{R})$. It can be
thought of as a way of keeping track of which branch of the square-root we’re on in
equations like (2.3.5b), and this provides its easiest realisation. Define $\mathrm{Mp}_2(\mathbb{R})$ to be
the set of all pairs $(\gamma, s)$, where $\gamma \in \mathrm{SL}_2(\mathbb{R})$ and $s = s(\tau)$ is a choice of holomorphic
square-root of $c\tau + d$. Since there are two choices for $s$ (differing by a sign), this is
indeed a double-cover. The group operation is

$(\gamma, s(\tau))\,(\gamma', s'(\tau)) = (\gamma\gamma',\ s(\gamma'.\tau)\, s'(\tau)),$ (2.4.9)

as can be seen by calculating from (2.3.6) with $k = 1/2$.
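The group operation on pairs $(\gamma, s)$ is consistent only because $c\tau + d$ obeys a cocycle identity: the bottom row of $\gamma\gamma'$ evaluated at $\tau$ equals $(c(\gamma'.\tau) + d)(c'\tau + d')$. The sketch below checks, at a sample point, that the product $s(\gamma'.\tau)\, s'(\tau)$ squares to the correct automorphy factor. It assumes this form of the product and uses the principal branch of the square root as one admissible choice of $s$; the names `mobius`, `principal_root`, `mp_mul` and `is_valid` are hypothetical.

```python
import cmath

def matmul(g, h):
    (a, b), (c, d) = g
    (e, f), (p, q) = h
    return ((a * e + b * p, a * f + b * q), (c * e + d * p, c * f + d * q))

def mobius(g, tau):
    """The usual action tau -> (a tau + b) / (c tau + d)."""
    (a, b), (c, d) = g
    return (a * tau + b) / (c * tau + d)

def principal_root(g):
    """One of the two square-roots of c*tau + d (principal branch on the upper half-plane)."""
    (a, b), (c, d) = g
    return lambda tau: cmath.sqrt(c * tau + d)

def mp_mul(x, y):
    """(g, s)(h, t) = (gh, tau -> s(h.tau) * t(tau))."""
    (g, s), (h, t) = x, y
    return (matmul(g, h), lambda tau: s(mobius(h, tau)) * t(tau))

def is_valid(x, tau, tol=1e-12):
    """Check that the second entry squares to c*tau + d for the first entry."""
    g, s = x
    (a, b), (c, d) = g
    return abs(s(tau) ** 2 - (c * tau + d)) < tol

g = ((2, 1), (3, 2))
h = ((1, -1), (1, 0))
x, y = (g, principal_root(g)), (h, principal_root(h))
tau = 0.3 + 0.9j
print(is_valid(mp_mul(x, y), tau))
```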

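Returning to the $\gamma$-twist $\pi'$ of the representation $\pi$ of $H$, it is possible to choose unitary
operators $R_{(\gamma,s)}$, for each $(\gamma, s) \in \mathrm{Mp}_2(\mathbb{R})$, such that $R_{(\gamma,s)} \pi = \pi' R_{(\gamma,s)}$ and $(\gamma, s) \mapsto R_{(\gamma,s)}$
defines a representation of the metaplectic group $\mathrm{Mp}_2(\mathbb{R})$.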

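Recalling the definition of $f_\tau$ and $u_{\mathbb{Z}}$ as eigenvectors, it isn’t difficult to see that

$R_{(\gamma,s)} f_\tau = s(\tau)^{-1} f_{\gamma.\tau}, \qquad \forall (\gamma, s) \in \mathrm{Mp}_2(\mathbb{R}),$ (2.4.10a)
$R_{(\gamma,s)} u_{\mathbb{Z}} = \eta_{(\gamma,s)}\, u_{\mathbb{Z}}, \qquad \forall (\gamma, s) \in \widetilde{\Gamma}_\theta = \{(\gamma, s) \in \mathrm{Mp}_2(\mathbb{R}) \mid \gamma \in \Gamma_\theta\},$ (2.4.10b)

where $\gamma.\tau$ is the usual action (2.1.4a) and where $\eta : \widetilde{\Gamma}_\theta \to \mathbb{C}^\times$ is some one-dimensional
representation (with values in eighth roots of unity). See chapter 8 of [440] for the
detailed calculation. We now immediately obtain from (2.4.6) and (2.4.7) that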


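$c\, e^{\pi i x_1(\tau x_1 + x_2)}\, \theta_3(\tau, x_1\tau + x_2) = \langle \pi_{(1,\gamma.x)} R_{(\gamma,s)} f_\tau,\ R_{(\gamma,s)} u_{\mathbb{Z}} \rangle = s(\tau)^{-1} \langle \pi_{(1,\gamma.x)} f_{\gamma.\tau},\ \eta_{(\gamma,s)} u_{\mathbb{Z}} \rangle$

$\qquad = c\, s(\tau)^{-1}\, \overline{\eta_{(\gamma,s)}}\, \exp\left[ \pi i (dx_1 - cx_2) \left( (dx_1 - cx_2) \frac{a\tau + b}{c\tau + d} + (-bx_1 + ax_2) \right) \right]$

$\qquad\quad \times\ \theta_3\left( \frac{a\tau + b}{c\tau + d},\ (dx_1 - cx_2) \frac{a\tau + b}{c\tau + d} + (-bx_1 + ax_2) \right),$ (2.4.11)

for all $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_\theta$, which simplifies down to the desired modularity (2.3.5b).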


Last subsection we learned that $\mathrm{SL}_2(\mathbb{R})$ acts transitively on $\mathbb{H}$. Using this and (2.4.10a),
we can refine (2.4.6) and write $\theta_3$ as a matrix entry of a unitary representation of the
obvious semi-direct product of $\mathrm{Mp}_2(\mathbb{R})$ with $H$. We obtain

$c\, e^{\pi i x_1(\tau x_1 + x_2)}\, \theta_3(\tau, x_1\tau + x_2) = \sqrt{ci + d}\ \langle \pi_{(1,x)} R_{(\gamma,s)} f_i,\ u_{\mathbb{Z}} \rangle$ (2.4.12)

where $\tau = \frac{ai + b}{ci + d}$, for $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{R})$.


This argument is far longer and more technically difficult than the other proofs of
theta function modularity given in this chapter, and it is easy to get lost in the details.
But it is a remarkable argument, and much more conceptual than, for example, Poisson
summation. The modular group $\mathrm{SL}_2(\mathbb{Z})$ (or rather its subgroup $\Gamma_\theta$) arises here as a group
of automorphisms of $H$ transforming in a controlled way the vectors $f_\tau$ and $u_{\mathbb{Z}}$. The
intrinsically algebraic nature of the argument means it generalises easily, and with little
extra effort we could have given the proof for Siegel theta functions. (Nonholomorphic)
Eisenstein series can also be constructed and studied in a similar way (by first lifting
to $\mathrm{SL}_2(\mathbb{R})$). But as with the previous modularity proofs, new ideas would be needed to
generalise it beyond these classical functions into a general device providing uniform
proofs of modularity for Moonshine functions. In the next subsection, though, we explain
why it might after all have something to do with Moonshine.


2.4.3 Braided #2: from the trefoil to Dedekind 


The decomposition (2.4.1b) says that $\mathrm{SL}_2(\mathbb{R})$ is topologically homeomorphic to $\mathbb{R}^2 \times S^1$,
i.e. the interior of a solid torus (or if one prefers, the complement of $S^1$ in $\mathbb{R}^3$). In
remarkable work in the context of computing $K_2(\mathbb{Z})$ (see Section 2.5.1), Quillen showed
that the space $\mathrm{SL}_2(\mathbb{Z})\backslash\mathrm{SL}_2(\mathbb{R})$ is naturally diffeomorphic to the complement of the trefoil
knot in the sphere $S^3$ (see pages 84-5 of [419] for the elementary argument). Namely,
the Eisenstein series $a = G_4$, $b = G_6$ in (0.1.5) identify the space $\mathrm{GL}_2(\mathbb{Z})\backslash\mathrm{GL}_2(\mathbb{R})$ of
two-dimensional lattices with the complement of the complex curve $20a^3 - 49b^2 = 0$
(which corresponds to degenerate lattices); the intersection of $20a^3 - 49b^2 = 0$ with the
sphere $|a|^2 + |b|^2 = 1$ in $\mathbb{C}^2$ (to get instead $\mathrm{SL}_2(\mathbb{Z})\backslash\mathrm{SL}_2(\mathbb{R})$) is then identified with the
trefoil (the (2,3)-torus knot, drawn in Figure 1.10). Now, in Section 2.4.1 we lift modular
forms for $\mathrm{SL}_2(\mathbb{Z})$ to the space $L^2(\mathrm{SL}_2(\mathbb{Z})\backslash\mathrm{SL}_2(\mathbb{R}))$: thus, for example, the $j$-function
is a complex-valued function on the complement of the trefoil. More generally, as we
will see later, the characters of an affine algebra, or vertex operator algebra, or rational
conformal field theory, are vector-valued functions on the complement of the trefoil. The
cusps of $\mathbb{H}$ can be interpreted as rational points on the trefoil. Can modular forms and
functions somehow see this topological trefoil? The answer is yes!

First, the fundamental group of the complement of the trefoil is easy to compute using
the Wirtinger presentation (Section 6.2.5), and is naturally isomorphic to the braid group
$\mathcal{B}_3$. This suggests the following picture. Write $G$ for $\mathrm{SL}_2(\mathbb{R})$, $\widetilde{G}$ for its universal cover
and $\Gamma$ for $\mathrm{SL}_2(\mathbb{Z})$. Then

$\widetilde{G} \xrightarrow{\ \pi\ } G \xrightarrow{\ q\ } \Gamma\backslash G.$ (2.4.13)

Of course $\pi$ is surjective and has kernel $\pi_1(G) \cong \mathbb{Z}$. $\widetilde{G}$ is also the universal cover of
the trefoil-complement $\Gamma\backslash G$, and the kernel of this surjective map $q \circ \pi$ is the central
extension $\pi_1(\Gamma\backslash G) \cong \mathcal{B}_3$ of the modular group $\mathrm{SL}_2(\mathbb{Z})$. The map $\mathcal{B}_3 \to \mathrm{SL}_2(\mathbb{Z})$ is simply
the reduced Burau representation (1.1.11b) specialised to $w = -1$ (recall (1.1.10a)).
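To make the map $\mathcal{B}_3 \to \mathrm{SL}_2(\mathbb{Z})$ concrete, here is a short sketch. The particular matrices used for $\sigma_1$ and $\sigma_2$ are an assumption: they are one common normalisation of the reduced Burau representation at $w = -1$, and the convention in (1.1.11b) may differ by a conjugation or transpose. Braid words are encoded as lists of $\pm 1$, $\pm 2$; the names `GENERATORS`, `image` and `degree` are hypothetical.

```python
def matmul(a, b):
    (p, q), (r, s) = a
    (t, u), (v, w) = b
    return ((p * t + q * v, p * u + q * w), (r * t + s * v, r * u + s * w))

IDENTITY = ((1, 0), (0, 1))

# Assumed images of sigma_1, sigma_2 and their inverses (keys -1, -2).
GENERATORS = {
    1: ((1, 1), (0, 1)),
    2: ((1, 0), (-1, 1)),
    -1: ((1, -1), (0, 1)),
    -2: ((1, 0), (1, 1)),
}

def image(word):
    """Multiply out the matrices of a braid word such as [1, 2, 1]."""
    result = IDENTITY
    for letter in word:
        result = matmul(result, GENERATORS[letter])
    return result

def degree(word):
    """Signed word length: +1 per generator, -1 per inverse generator."""
    return sum(1 if letter > 0 else -1 for letter in word)

print(image([1, 2, 1]) == image([2, 1, 2]))       # braid relation
print(image([1, 2] * 3), image([1, 2] * 6))       # central element and its square
```

In this normalisation the central braid $(\sigma_1\sigma_2)^3$ maps to $-I$ and its square, of degree 12, maps to the identity, so the kernel of the map is visibly nontrivial.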

So what does this mean for modular forms? Recall from Section 2.2.1 that modular
forms for $\mathrm{SL}_2(\mathbb{Z})$ have multiplier $\mu$ that carries a projective representation of $\mathrm{SL}_2(\mathbb{Z})$;
it will be a true representation only when the weight $k$ is an integer. As we emphasise in
Section 3.1.1, projective representations become true representations when one centrally
extends. Especially when the weight is fractional, the role of $\mathrm{SL}_2(\mathbb{R})$ really should be
played by the more fundamental Lie group $\widetilde{\mathrm{SL}_2(\mathbb{R})}$, and likewise the modular group
$\mathrm{SL}_2(\mathbb{Z})$ should be replaced by its central extension $\mathcal{B}_3$.

For a good example, recall the Dedekind eta function $\eta(\tau)$ of (2.2.6b). As we see
in (2.2.8), it is a modular form for $\mathrm{SL}_2(\mathbb{Z})$ of weight $\frac{1}{2}$, whose multiplier $\mu$ is quite
complicated as a function on $\mathrm{SL}_2(\mathbb{Z})$. But $\mathcal{B}_3$ is the more fundamental transformation
group underlying $\eta(\tau)$. Indeed, in terms of $\mathcal{B}_3$, the multiplier is trivial to describe:

$\mu(\beta) = \exp\left[ \frac{2\pi i}{24} \deg\beta \right],$ (2.4.14)

where the degree of a braid is the length of its word in $\sigma_1$, $\sigma_2$ (Section 1.1.4). More
generally, the multiplier for any modular form for $\mathrm{SL}_2(\mathbb{Z})$ will be similar, with ‘24’
replaced by some other rational. Surely this algebraic interpretation of Dedekind sums
in terms of $\mathcal{B}_3$ is related to the topological interpretation of Dedekind sums reviewed and
explored in [24]; see also [23], [43].
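The degree formula can be compared with the eta function numerically on two short braids. This is a hedged sketch with several assumptions that are not taken from the text: $\eta(\tau) = q^{1/24}\prod_{n \ge 1}(1 - q^n)$ with $q = e^{2\pi i \tau}$; $\sigma_1$ is taken to act as $\tau \mapsto \tau + 1$; $\sigma_1\sigma_2\sigma_1$ is taken to map to $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, so that $c\tau + d = -\tau$; and the principal branch of $\sqrt{c\tau + d}$ is used. The names `eta` and `braid_multiplier` are hypothetical.

```python
import cmath
import math

def eta(tau, terms=200):
    """Truncated product q^(1/24) * prod_{n>=1} (1 - q^n), with q = exp(2 pi i tau)."""
    q = cmath.exp(2j * math.pi * tau)
    product, q_power = 1, 1
    for _ in range(terms):
        q_power *= q
        product *= (1 - q_power)
    return cmath.exp(2j * math.pi * tau / 24) * product

def braid_multiplier(deg):
    """exp[2 pi i deg / 24], the multiplier attached to a braid of the given degree."""
    return cmath.exp(2j * math.pi * deg / 24)

tau = 0.1 + 1.2j

# sigma_1 has degree 1 and (by assumption) acts as tau -> tau + 1.
gap_T = abs(eta(tau + 1) - braid_multiplier(1) * eta(tau))

# sigma_1 sigma_2 sigma_1 has degree 3 and (by assumption) acts as tau -> -1/tau
# with automorphy factor sqrt(-tau) on the principal branch.
gap_S = abs(eta(-1 / tau) - braid_multiplier(3) * cmath.sqrt(-tau) * eta(tau))

print(gap_T, gap_S)
```

With a different branch or a different lift of $S$ the phase in the second check changes by a sign or a power of $i$, which is exactly the bookkeeping that passing to $\mathcal{B}_3$ is meant to absorb.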


Of course the multiplier of $\eta$ is almost as trivial if we write $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$ as
a monomial in the generators $S$, $T$, but finding that monomial isn’t easy. On the other
hand, finding ‘$\deg\beta$’ by looking at the braid $\beta$ is easy: just count the crossings in $\beta$, with
signs. The multiplier, as a function of $\beta$, is far simpler than as a function of $a, b, c, d$.
Our topological considerations have been rewarded!

Likewise, the multiplier in the vector-valued Jacobi form (2.3.8) (again of weight
$\frac{1}{2}$) defines a four-dimensional projective representation of $\mathrm{SL}_2(\mathbb{Z})$, given by the tensor
product of the one-dimensional representation $\exp[2\pi i \deg\beta/8]$ of $\mathcal{B}_3$, with a true four-dimensional
representation of $\mathrm{SL}_2(\mathbb{Z})$.

Of course the metaplectic group was introduced last subsection for essentially the
same reason ($\mathrm{Mp}_2(\mathbb{R})$ is also a quotient of $\widetilde{\mathrm{SL}_2(\mathbb{R})}$). Indeed, since most modular forms
arising in the literature have weight in $\frac{1}{2}\mathbb{Z}$, the metaplectic group is a large enough central
extension, and $\widetilde{\mathrm{SL}_2(\mathbb{R})}$ may seem like overkill. But modular forms with fractional weight
exist in abundance for arbitrarily large denominator (see e.g. [303] for examples). The
important ‘one-point functions on a torus’ (Section 4.3.2) in conformal field theory
(CFT), to which family the Moonshine functions naturally belong, can form vector-valued
modular forms of arbitrary rational weight. We will see in Section 7.2.4 how
nicely the CFT machinery accommodates this universal $\mathcal{B}_3$ action, and also how other
considerations in (Monstrous) Moonshine are trying to focus our attention on the relation
of $\mathcal{B}_3$ to modular functions.

The braid group $\mathcal{B}_3$ is at least as relevant for the nonholomorphic automorphic forms
of $\mathrm{SL}_2(\mathbb{Z})$, alluded to in Section 2.4.1. For a simple example, [379] studies the Maass
cusp forms $u(\tau)$ (with weight 0), identifying them with ‘period functions’ $\psi(z)$; the
exact symmetry $u(-1/\tau) = u(\tau)$ becomes $\psi(1/z) = z^{2s}\psi(z)$, where $s$ is the ‘spectral
parameter’ of $u$. This transformation of the $\psi$’s, with the factor $z^{2s}$, is what one would
expect from the braid group (compare (7.2.4)).

We should regard $\mathcal{B}_3$ as the universal symmetry of (not necessarily holomorphic)
modular forms for $\mathrm{SL}_2(\mathbb{Z})$. If instead we have modular forms for some subgroup $\Gamma$ of
$\mathrm{SL}_2(\mathbb{Z})$, then the role of $\mathcal{B}_3$ is replaced by its subgroup that projects (via the reduced
Burau representation (1.1.11b) specialised to $w = -1$) to $\Gamma$. For instance, the principal
congruence subgroup $\Gamma(2)$ corresponds to the pure braid group $\mathcal{P}_3$. It would be interesting
to find the topological interpretation of $\Gamma_0(p)+$ in (7.1.5) and the other modular groups
appearing in Monstrous Moonshine.

The lesson of Section 2.4.1 is that, whenever we have some sort of modularity for, 
for example, SL2(Z), we should lift the domain to that of the relevant Lie group (e.g. 
SL2(R)). This should be especially valuable for providing perspective and clarity when 
we are investigating a new modular-like phenomenon. To give one example among many, 
[519] introduces nonholomorphic deformations of familiar modular forms relevant to 
strings on a pp-wave background (a 1-parameter deformation of flat space-time). Of 
more direct relevance to us is the question: Is it natural to regard the modular functions
(characters) of RCFT, VOAs and Moonshine as functions on $\mathrm{SL}_2(\mathbb{R})$?

The lesson of this subsection is that an $\mathrm{SL}_2(\mathbb{Z})$-action may become simpler when
lifted to its central extension $\mathcal{B}_3$. The braid group provides a clean universal formulation
especially appropriate when metaplectic groups or other central extensions of $\mathrm{SL}_2(\mathbb{Z})$
arise. Mathematics thrives on having alternate interpretations for the same phenomenon:
here we replace the matrix group $\mathrm{SL}_2(\mathbb{Z})$ (or its subgroups) with the topologically defined
$\mathcal{B}_3$ (or its subgroups). Some things will be easier in one formalism, and presumably other
things in the other (e.g. the multipliers $\mu$ are much easier for $\mathcal{B}_3$). It is tempting to apply
this to the so-called S-duality of superstrings (Section 3.2.5). Are there other ways
modular forms for $\mathrm{SL}_2(\mathbb{Z})$ see the trefoil?

The modularity argument of Section 2.4.2 has never been applied to Monstrous Moonshine,
to this author’s knowledge. But one hint that it might be the shadow of such a
device is that the braid group lurks here. In particular, there is an action of $\mathcal{B}_3$ on $G \times G$,
for any group $G$ (Question 2.4.4); the action (2.4.8) of $\mathrm{SL}_2(\mathbb{Z})$ on $H$ is really this action
of $\mathcal{B}_3$ on $\mathbb{R}^2$, and it factors through to $\mathrm{SL}_2(\mathbb{Z})$ because $\mathbb{R}^2$ is abelian. In Section 7.3.3
we use this same action, this time applied to $\mathbb{M} \times \mathbb{M}$, to identify the group-theoretic
property of the Monster $\mathbb{M}$ that could be responsible for the genus-0 properties of the
McKay–Thompson series $T_g$.

Another hint, perhaps more substantial, of its relevance to Moonshine-like phenomena
is the repeated appearance of Maslov indices in the study of gluing anomalies in three-dimensional
topological field theory (see chapter IV of [534]). This suggested to Turaev
an intimate relation of topological field theory with the Segal–Shale–Weil representations
of the metaplectic groups. These representations also appear in the context of braids and
subfactors [252]: metaplectic representations arise naturally there when constructing
knot invariants from braids. Much of the mathematical background is developed in [387],
where we also learn that the universal cover $\widetilde{\mathrm{SL}_2(\mathbb{R})}$ can easily be expressed using Maslov
indices.


Question 2.4.1. Use the decomposition (2.4.1b) to find a (noncanonical) group structure 
on H, inherited from that of SL2(R). 


Question 2.4.2. Show that uniqueness of the representation in Theorem 2.4.2 fails if 
H is replaced with infinitely many coupled Heisenberg groups. (This is a major com- 
plication for quantum field theory, as we see in Section 4.2.2 in the context of Haag’s 
Theorem.) 


Question 2.4.3. Verify that any function of the form $F(x) = \langle \pi_{(1,x)} f, u_{\mathbb{Z}} \rangle$, for any $f$ for
which $F$ is defined, necessarily obeys $F(x + n) = (-1)^{n_1 n_2}\, e^{-\pi i (n_1 x_2 - n_2 x_1)}\, F(x)$. Hence $u_{\mathbb{Z}}$
is responsible for the quasi-periodicity (2.3.5a) of $\theta_3$.


Question 2.4.4. (a) Let $G$ be a finite group. Verify that we obtain a right braid group $\mathcal{B}_3$
action on the Cartesian product $G \times G \times G$, by defining

$(g, h, k).\sigma_1 = (ghg^{-1}, g, k), \qquad (g, h, k).\sigma_2 = (g, hkh^{-1}, h),$ (2.4.15a)

where $\sigma_i$ are the usual generators of $\mathcal{B}_3$ (recall (1.1.9)). Also, verify that there is a right
$\mathcal{B}_3$-action on $G \times G$, generated by

$(g, h).\sigma_1 = (g, gh), \qquad (g, h).\sigma_2 = (gh^{-1}, h).$ (2.4.15b)

(b) Let $C \subset G \times G$ consist of all pairs $(g, h)$ where $gh = hg$. Show that this $\mathcal{B}_3$ action
takes $C$ to itself, and that its restriction to $C$ actually defines an action of $\mathrm{SL}_2(\mathbb{Z})$
on $C$.

(c) Extend the $\mathcal{B}_3$ actions of (a) to $\mathcal{B}_n$ actions on $G^n$ and $G^{n-1}$.
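Parts (a) and (b) of Question 2.4.4 can be explored by brute force on a small nonabelian group. The sketch below is an experiment, not a solution: it takes $G = S_3$ (permutations of three letters), uses $(g,h,k).\sigma_1 = (ghg^{-1}, g, k)$, $(g,h,k).\sigma_2 = (g, hkh^{-1}, h)$ on triples, and on pairs it assumes $(g,h).\sigma_1 = (g, gh)$, $(g,h).\sigma_2 = (gh^{-1}, h)$, with an inverse in the second generator. For comparison it also tries the variant $(gh, h)$ without the inverse. All function names are hypothetical.

```python
from itertools import permutations, product

def compose(p, q):
    """Group product in S_3: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

G = list(permutations(range(3)))

def s1_pair(x):
    g, h = x
    return (g, compose(g, h))

def s2_pair(x):
    g, h = x
    return (compose(g, inverse(h)), h)

def s2_pair_naive(x):
    """Variant without the inverse, kept only for comparison."""
    g, h = x
    return (compose(g, h), h)

def s1_triple(x):
    g, h, k = x
    return (compose(compose(g, h), inverse(g)), g, k)

def s2_triple(x):
    g, h, k = x
    return (g, compose(compose(h, k), inverse(h)), h)

def braid_relation_holds(s1, s2, points):
    """Test sigma1 sigma2 sigma1 = sigma2 sigma1 sigma2 on every point."""
    return all(s1(s2(s1(x))) == s2(s1(s2(x))) for x in points)

pairs = list(product(G, repeat=2))
triples = list(product(G, repeat=3))
commuting = [(g, h) for g, h in pairs if compose(g, h) == compose(h, g)]

def full_twist(x):
    """Apply (sigma1 sigma2)^6, which lies in the kernel of B_3 -> SL_2(Z)."""
    for _ in range(6):
        x = s2_pair(s1_pair(x))
    return x

print(braid_relation_holds(s1_pair, s2_pair, pairs),
      braid_relation_holds(s1_triple, s2_triple, triples),
      braid_relation_holds(s1_pair, s2_pair_naive, pairs))
```

On commuting pairs the full twist acts trivially, which is the experimental shadow of the claim in (b) that the action there factors through $\mathrm{SL}_2(\mathbb{Z})$.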


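Question 2.4.5. (a) Show that $\mathrm{SL}_2(\mathbb{R})$ is isomorphic to the group
$\mathrm{SU}_{1,1}(\mathbb{C}) := \left\{ \begin{pmatrix} \alpha & \beta \\ \bar\beta & \bar\alpha \end{pmatrix} \in \mathrm{SL}_2(\mathbb{C}) \right\}$
by showing they are conjugate in $\mathrm{SL}_2(\mathbb{C})$.
(b) Verify that $\mathrm{SU}_{1,1}(\mathbb{C})$ is isomorphic to the set of all pairs $(\gamma, \theta)$, where $|\gamma| < 1$ and
$-1 < \theta \le 1$, with group operation $(\gamma, \theta)(\gamma', \theta') = (\gamma'', \theta'')$ where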


1 5-200 1 1 5-200 
We y+ye 6" =0 +08 + sige es 


z 1+ Vy'e— 270 Oni 1+ Vy'e- 270 


(mod 2). 


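(c) Using (b), realise the universal cover $\widetilde{\mathrm{SL}_2(\mathbb{R})}$ of $\mathrm{SL}_2(\mathbb{R})$.
(d) Realise $\mathcal{B}_3$ as a subgroup of $\widetilde{\mathrm{SL}_2(\mathbb{R})}$.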
2.5 Meta-patterns in mathematics
2.5.1 Twenty-four 


There are lots of ‘meta-patterns’ in mathematics, i.e. collections of seemingly different 
problems that have similar answers, or structures that appear more often than we would 
have expected. Once one of these meta-patterns is identified it is always helpful to under- 
stand what is responsible for it, to see what simple structure or basic lemma underlies it. 
Why are groups so important in mathematics and science? Because they are the devices 
through which we ‘act’ on sets, spaces, etc. Mathematics is not above metaphysics; 
like any area it grows by asking questions, and changing one’s perspective — even to a 
metaphysical one — should suggest new questions. 

To give a trivial example, years ago while the author was writing up his PhD thesis
he noticed in several places the numbers 1, 2, 3, 4 and 6. For instance $\cos(2\pi r) \in \mathbb{Q}$ for
$r \in \mathbb{Q}$ iff the denominator of $r$ is 1, 2, 3, 4 or 6. Likewise, the theta function $\Theta_{\mathbb{Z}+r}(\tau)$ for
$r \in \mathbb{Q}$ can be written as $\sum_i a_i \theta_3(b_i \tau)$ for some $a_i, b_i \in \mathbb{R}$ iff the denominator of $r$ is 1, 2,
3, 4 or 6. This pattern is easy to explain: they are precisely those positive integers $n$ with
Euler totient $\phi(n) \le 2$, that is there are at most two positive numbers less than $n$ coprime
to $n$. The various incidences of these numbers can usually be reduced to this $\phi(n) \le 2$
property. For example, the number field $\mathbb{Q}[\cos(2\pi \frac{a}{b})]$ (see Section 1.7.1), considered as
a vector space over $\mathbb{Q}$, has dimension $\phi(b)/2$.
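The list 1, 2, 3, 4, 6 is easy to recover by direct computation. The following is a small illustrative sketch (the search bound 200 is arbitrary and the name `totient` is a hypothetical helper): it computes Euler's totient by counting and collects the integers with $\phi(n) \le 2$.

```python
from math import gcd

def totient(n):
    """Euler's totient: the number of k in 1..n that are coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

small_totient = [n for n in range(1, 200) if totient(n) <= 2]
print(small_totient)
```

A finite search cannot by itself rule out larger examples; that follows from the standard lower bounds on $\phi(n)$.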

A more interesting meta-pattern involves the number 24 and its divisors (especially
8). One sees 24 wherever modular forms naturally appear. For instance, we see it in
the critical dimensions in string theory: the bosonic string lives in a background space-time
of dimension 24 + 2, while the fermionic string lives in 8 + 2 dimensions. Another
example: the dimensions of even self-dual positive-definite lattices must be a multiple
of 8 (e.g. the $E_8$ root lattice has dimension 8, while the Leech lattice has dimension 24).
The meta-pattern 24 is also easy to understand: the fundamental problem for which it
is the answer is the following one. Fix $n$, and consider the congruence $x^2 \equiv 1 \pmod{n}$.
Certainly in order to have a chance of satisfying this, $x$ and $n$ must be coprime. The
extreme situation¹¹ is when every number $x$ coprime to $n$ satisfies this congruence: that
is,

$\gcd(x, n) = 1 \ \Rightarrow\ x^2 \equiv 1 \pmod{n}.$ (2.5.1)


The reader can try to verify the following simple fact: n obeys this extreme situation 
(2.5.1) iff n divides 24. What does this congruence property have to do with these other 
occurrences of 24? The elementary argument for even self-dual positive-definite lattices 
involves the construction L{T } of Section 2.3.3 and is sketched in Question 2.5.1. 
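The ‘simple fact’ about (2.5.1) is pleasant to confirm by brute force before proving it. The sketch below is only an experiment (the search bound 500 is arbitrary, and `all_units_square_to_one` is a hypothetical name): it lists the $n$ for which every $x$ coprime to $n$ satisfies $x^2 \equiv 1 \pmod n$.

```python
from math import gcd

def all_units_square_to_one(n):
    """True when every x coprime to n satisfies x^2 = 1 (mod n)."""
    return all((x * x) % n == 1 % n
               for x in range(1, n + 1) if gcd(x, n) == 1)

extreme = [n for n in range(1, 500) if all_units_square_to_one(n)]
print(extreme)
```

Within the search range the result is exactly the list of divisors of 24.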

11 This is a standard trick in mathematics: when some sort of bound is established, look at the extremal cases
that realise that bound. If your bound is a good one, it should be possible to say something about those
extremal cases, and having something to say is always of paramount importance. This strategy is used, for
instance, in the definition of normal subgroup in Section 1.1.1 and of simple-currents in Section 6.1.1.

The ‘24’ appearing in the $q^{1/24}$ of $\eta$ is the same as the 24 in $c/24$ appearing in,
for example, (3.1.10); in both cases they come from $\zeta(-1) = -1/12$ or equivalently
$\zeta(2) = \pi^2/6$. Are these the same as the 24 in (2.5.1)? Note from the right side of (2.2.6b)
that

$\eta(\tau) = \Theta_{\mathbb{Z}+1/12}(12\tau) - \Theta_{\mathbb{Z}+5/12}(12\tau).$

Using this identity, the fact that $\eta(\tau + 1)$ is a constant multiple of $\eta(\tau)$ is indeed related
to (2.5.1). Moreover, this ‘1/24’ is directly related to the abelianisation

$\mathrm{SL}_2(\mathbb{Z})/[\mathrm{SL}_2(\mathbb{Z}), \mathrm{SL}_2(\mathbb{Z})] \cong \mathbb{Z}_{12}$ (2.5.2)

of $\mathrm{SL}_2(\mathbb{Z})$: writing $\eta(-1/\tau)^2 = a\tau\,\eta(\tau)^2$ and $\eta(\tau + 1)^2 = b\,\eta(\tau)^2$, the multiplier $S \mapsto a$,
$T \mapsto b$ must define a one-dimensional representation of $\mathrm{SL}_2(\mathbb{Z})$, since $\eta^2$ has weight
1 (recall Question 2.2.3); for any group $G$, and in particular $\mathrm{SL}_2(\mathbb{Z})$, the abelianisation
$G/[G, G]$ is isomorphic to the group of all one-dimensional representations of $G$. This
argument forces $b$ to be some 12th root of unity, and $a$ to be $b^9$.

Perhaps the most intriguing ‘24’ occurs as a K-theoretic invariant of the integers. K-theory
is a generalised (co)homology theory, and as such associates a sequence of abelian
groups $K_i(X)$ to the object $X$, which can capture some subtle aspects of $X$. When $X$ is
a ring, the definition of these invariants $K_i(X)$ is quite involved, and their calculation
is very difficult (see e.g. [419]); for example, for $X = \mathbb{Z}$ the groups are known only
for $0 \le i \le 5$, where they equal $\mathbb{Z}, \mathbb{Z}_2, \mathbb{Z}_2, \mathbb{Z}_{48}, 0, \mathbb{Z}$, respectively. $K_0(\mathbb{Z}) \cong \mathbb{Z}$ says that
the projective $\mathbb{Z}$-modules are the free $\mathbb{Z}$-modules $\mathbb{Z}^n$, while $K_1(\mathbb{Z}) \cong \mathbb{Z}_2$ tells us that the
Euclidean domain $\mathbb{Z}$ has only two units (namely, $\pm 1$). The first interesting group in this
list is $\mathbb{Z}_{48}$, which arises naturally here as an extension of $\mathbb{Z}_{24}$. Thus 24 (or 48) is a number
intimately associated with $\mathbb{Z}$. This author knows no direct connection with our definition
(2.5.1) of 24, but there is a conjectural relation of $\|K_{4n-2}(\mathbb{Z})_{\mathrm{torsion}}\| / \|K_{4n-1}(\mathbb{Z})_{\mathrm{torsion}}\|$
with values $\zeta(1 - 2n)$ of the Riemann zeta function (see e.g. [230a]). In particular,
$K_3(\mathbb{Z}) \cong \mathbb{Z}_{48}$ is related to $\zeta(-1) = -\frac{1}{12}$, which in turn is related to our 24.


2.5.2 A-D-E 


A much deeper and still not-completely-understood meta-pattern is called A-D-E (see
[16] for a discussion and examples). The name comes from the simply-laced Lie algebras,
i.e. the simple finite-dimensional Lie algebras whose Coxeter–Dynkin diagrams (see
Figure 1.17) contain only single edges (i.e. no arrows). These are the $A_n$- and $D_n$-series,
along with the $E_6$, $E_7$ and $E_8$ exceptionals. The observation is that many other problems,
which don’t have anything directly in common with simple Lie algebras, have a solution
that falls into this A-D-E pattern. Of course, for an object to be meaningfully labelled
$X_\ell$, at least some of the data associated with the algebra $X_\ell$ should reappear in some form
in that object. Let’s look at some examples.

Consider any even positive-definite integral lattice $L$ (Section 1.2.1). The smallest
possible nonzero length-squared in $L$ will be 2, and the vectors of length-squared 2 are
special and are called roots (Question 1.2.5). It is important in lattice theory to know
the lattices that are spanned by their roots; it turns out these are precisely the orthogonal
direct sums of lattices called $A_n$, $D_n$ and $E_6$, $E_7$ and $E_8$ (Theorem 1.2.2). They carry
those names for a number of reasons. For example, the lattice called $X_n$ has a basis
$\{\alpha_1, \ldots, \alpha_n\}$ with the property that the Gram matrix $A_{ij} := \alpha_i \cdot \alpha_j$ is the Cartan matrix
(see Section 1.4.5) for the Lie algebra $X_n$. Also, the group generated by reflections in
the roots of the lattice $X_n$ is naturally isomorphic to the Weyl group of the Lie algebra
$X_n$. Moreover, to any simple Lie algebra there is canonically associated a lattice called
the root lattice; for the simply-laced algebras, these are isomorphic to the lattice of the
same name. Incidentally, the root lattices for the non-simply-laced simple Lie algebras
are (up to rescalings) orthogonal direct sums of the simply-laced root lattices.

A famous A-D-E example is due to McKay.¹² Consider any finite subgroup $G$ of
the Lie group $\mathrm{SU}_2(\mathbb{C})$ (i.e. the $2 \times 2$ unitary matrices with determinant 1). For example,
there is the cyclic group $\mathbb{Z}_n$ of $n$ elements generated by the matrix

$M_n = \begin{pmatrix} \exp[2\pi i/n] & 0 \\ 0 & \exp[-2\pi i/n] \end{pmatrix}.$

There are also the (doubles of) dihedral groups $D_n$, and the binary tetrahedral, binary
octahedral and binary icosahedral groups of orders 24, 48 and 120, respectively. Let
$R_i$ be the irreducible representations of $G$. For instance, for $\mathbb{Z}_n$ there are precisely $n$
of these, all one-dimensional, given by sending the generator $M_n$ to $\exp[2\pi i k/n]$ for
each $k = 1, 2, \ldots, n$. Now consider the tensor product $G \otimes R_i$, where we interpret $G \subset \mathrm{SU}_2(\mathbb{C})$
here as a two-dimensional representation. By Theorem 1.1.2 we can decompose
that product into a direct sum $\oplus_j m_{ij} R_j$ of irreducibles (the $m_{ij}$ here are multiplicities).
Now create a graph with one node for each $R_i$, and with the $i$th and $j$th nodes ($i \ne j$)
connected with precisely $m_{ij}$ directed edges $i \to j$. If $m_{ij} = m_{ji}$, we agree to erase the
double arrows from the $m_{ij}$ edges. Then McKay [411] observed that this graph, for any
of these finite $G < \mathrm{SU}_2(\mathbb{C})$, is a distinct extended Coxeter–Dynkin diagram of A-D-E
type (these are all listed in Figure 3.2). For instance, the cyclic group with $n$ elements
yields the extended graph of $A_{n-1}$.
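For the cyclic group the McKay graph can be computed directly from characters. The sketch below is an illustration under stated assumptions: the irreducibles are indexed $k = 0, \ldots, n-1$ (rather than $1, \ldots, n$) with character $\chi_k(M_n^t) = e^{2\pi i k t/n}$, the two-dimensional representation has character $2\cos(2\pi t/n)$, and multiplicities are obtained from the usual character inner product. The name `mckay_matrix_cyclic` is hypothetical.

```python
import cmath
import math

def mckay_matrix_cyclic(n):
    """m[i][j] = multiplicity of R_j in (2-dim rep) tensor R_i, for the cyclic group of order n."""
    def chi(k, t):
        return cmath.exp(2j * math.pi * k * t / n)

    def chi_two_dim(t):
        return 2 * math.cos(2 * math.pi * t / n)

    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            inner = sum(chi_two_dim(t) * chi(i, t) * chi(j, t).conjugate()
                        for t in range(n)) / n
            m[i][j] = round(inner.real)
    return m

for row in mckay_matrix_cyclic(5):
    print(row)
```

Each node is joined to its two cyclic neighbours, so the graph is an $n$-cycle, the extended diagram of $A_{n-1}$; for $n = 2$ the two edges coincide into a double edge.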

How was McKay led to his remarkable correspondence? He knew that the sum of the
labels $a_i = 1, 2, 3, 4, 5, 6, 4, 2, 3$ associated with each node of the extended $E_8$ diagram
(Figure 3.2) equals 30, the Coxeter number of $E_8$. So what do their squares add to?
120, which he recognised as the cardinality of one of the exceptional finite subgroups of
$\mathrm{SU}_2(\mathbb{C})$, and that got him thinking...
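The arithmetic behind McKay's observation takes two lines to reproduce. This is only a sketch of the numerology, using the labels 1, 2, 3, 4, 5, 6, 4, 2, 3 quoted above; the variable names are arbitrary.

```python
labels = [1, 2, 3, 4, 5, 6, 4, 2, 3]   # labels on the extended E8 diagram

label_sum = sum(labels)
square_sum = sum(a * a for a in labels)
print(label_sum, square_sum)
```

The second number matching a group order is no accident: the squares of the dimensions of the irreducible representations of a finite group always sum to its order.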

A deep example of A-D-E, due to Arnol’d, are the simple singularities. A singularity
or critical point of a smooth function $f : \mathbb{C}^n \to \mathbb{C}$ is a point $z \in \mathbb{C}^n$ where all first partial
derivatives $\partial_i f$ vanish. For example, $f(z) = z^{k+1}$ has a singularity at $z = 0$ for any integer
$k \ge 1$. We identify singularities if locally they merely differ by a change-of-coordinates;
see, for example, [19] for details. For example, any singularity of $f : \mathbb{C} \to \mathbb{C}$ is equivalent
to one of the form $f(z) = z^{k+1}$. A simple singularity is an isolated singularity and behaves
like the poles $f(z) = z^{-n}$ of usual complex analysis; again see [19] for the precise
definition. For example, $z_1^2 + z_2^{k+1}$ is simple but $z_1^4 + 3z_1^2 z_2^2 + z_2^4$ is not (the coefficient
‘3’ can be deformed, yielding a continuum of inequivalent singularities).


12 He is the same John McKay we celebrated in Chapter 0. 
Table 2.2. The simple singularities in $\mathbb{C}^2$

Name             $A_n$             $D_n$               $E_6$          $E_7$           $E_8$
Representative   $x^2 + y^{n+1}$   $x^2 y + y^{n-1}$   $x^3 + y^4$    $x^3 + xy^3$    $x^3 + y^5$


Table 2.2 lists the simple singularities in $\mathbb{C}^2$ up to equivalence. In higher dimensions we
get the same list, with the extra variables coming in as $z_3^2 + \cdots + z_n^2$. These singularities
can be related to McKay’s A-D-E as follows. The group $\mathrm{SU}_2(\mathbb{C})$ acts on $\mathbb{C}^2$ in the
obvious way (matrix multiplication). If $G$ is a discrete subgroup of $\mathrm{SU}_2(\mathbb{C})$, then consider
the ring of polynomials in two variables $w_1, w_2$ invariant under $G$. It turns out it will
have three generators $x(w_1, w_2)$, $y(w_1, w_2)$, $z(w_1, w_2)$, which are connected by one
polynomial relation (syzygy). For instance, take $G$ to be the cyclic group $\mathbb{Z}_n$; then
we’re interested in polynomials $p(w_1, w_2)$ invariant under $w_1 \mapsto \exp[2\pi i/n]\, w_1$,
$w_2 \mapsto \exp[-2\pi i/n]\, w_2$. Any such invariant $p(w_1, w_2)$ is clearly generated by (i.e. can be written
as a polynomial in) $w_1 w_2$, $w_1^n$ and $w_2^n$. Choosing instead the generators $x = \frac{w_1^n - w_2^n}{2}$,
$y = w_1 w_2$, $z = i\,\frac{w_1^n + w_2^n}{2}$, we get the syzygy $y^n = -(x^2 + z^2)$. For any $G$, generators
$x, y, z$ can always be found so that the syzygy will be one of the polynomials in Table 2.2
(with ‘$+z^2$’ appended), and this will give the equation of the algebraic surface $\mathbb{C}^2/G$ as a
two-dimensional complex surface in $\mathbb{C}^3$. For example, the complex surfaces $\mathbb{C}^2/\mathbb{Z}_n$ and
$\{(x, y, z) \in \mathbb{C}^3 \mid x^2 + y^2 + z^n = 0\}$ are equivalent.
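The syzygy for the cyclic group can be checked at a random-looking point. In this sketch the normalisation of the generators, $x = (w_1^n - w_2^n)/2$, $y = w_1 w_2$, $z = i(w_1^n + w_2^n)/2$, is an assumption chosen so that $y^n = -(x^2 + z^2)$ holds identically; other normalisations give the same relation up to rescaling. The name `invariants` is hypothetical.

```python
import cmath
import math

def invariants(w1, w2, n):
    """Three generators of the Z_n-invariant polynomials, in one possible normalisation."""
    x = (w1 ** n - w2 ** n) / 2
    y = w1 * w2
    z = 1j * (w1 ** n + w2 ** n) / 2
    return x, y, z

n = 5
w1, w2 = 0.7 + 0.2j, -0.3 + 1.1j
x, y, z = invariants(w1, w2, n)

# The syzygy y^n + x^2 + z^2 = 0.
print(abs(y ** n + x * x + z * z))

# Invariance under w1 -> zeta*w1, w2 -> w2/zeta with zeta = exp(2 pi i / n).
zeta = cmath.exp(2j * math.pi / n)
x2, y2, z2 = invariants(zeta * w1, w2 / zeta, n)
print(abs(x - x2), abs(y - y2), abs(z - z2))
```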

There are other ways these singularities can be associated with A-D-E. Given a
surface $\Sigma \subset \mathbb{C}^3$ with a single singularity, a resolution $\widetilde{\Sigma}$ is a smooth surface without
singularities that agrees with $\Sigma$ away from the singularity (again see [19] for details). A
minimal resolution is one through which any other resolution must factor. The minimal
resolution exists and is unique. For example, the $A_1$ singularity $x^2 + y^2 + z^2 = 0$ has
the resolution

$\widetilde{\Sigma} = \{ (x, y, z, (a, b)) \in \mathbb{C}^3 \times \mathbb{P}^1(\mathbb{C}) \mid x^2 + y^2 + z^2 = 0,\ xb = ya \}.$

For $(x, y) \ne (0, 0)$, $xb = ya$ uniquely determines the homogeneous coordinates $(a, b)$,
but the singularity $(x, y) = (0, 0)$ is blown up into the sphere $\mathbb{P}^1(\mathbb{C})$; the points on the
sphere parametrise the different (complex) directions in which the singularity can be
approached.

More generally, given a minimal resolution $\pi : \widetilde{\Sigma} \to \Sigma$ of a simple singularity, $\pi^{-1}(0)$
will be a union of $r$ spheres $\cup C_i$. du Val [165] noticed that these classes $[C_i]$ form a basis
of the homology group $H_2(\widetilde{\Sigma}, \mathbb{Z})$, on which there is defined a $\mathbb{Z}$-valued intersection form;
this form makes $H_2(\widetilde{\Sigma}, \mathbb{Z})$ into a negative-definite lattice isomorphic (up to a factor of
$\sqrt{-1}$) to the root lattice of $X_r$, where $[C_i]$ map to a basis of simple roots. The Weyl
group of $X_r$ is isomorphic to the so-called monodromy group of the singularity (see [19]
for details).

Incidentally, the McKay correspondence refers to the strategy of describing the geometry
of the resolution of the orbifold singularities $\mathbb{C}^n/G$ for finite subgroups $G$ of $\mathrm{SL}_n(\mathbb{C})$,
through the representation theory of $G$. See [254] for the $n = 2$ story (i.e. for the simple
singularities) and [471] for fascinating speculations on what happens in dimension
$n > 2$.

Fig. 2.9 The connected multigraphs with largest eigenvalue 2.

Arguably the first A-D-E classification goes back to Theaetetus, who classified the
regular solids in 400 B.C. For instance, the tetrahedron can be associated with $E_6$ while
the cube is matched with $E_7$. This A-D-E is only partial, as there are no regular solids
assigned to the A-series, and to get the D-series one must look at 'degenerate regular
solids', that is the regular polygons.

The closest we have to an explanation of the A-D-E meta-pattern would seem to
be graphs of small eigenvalues. Consider any multigraph $G$; that is, we allow multiple
edges (there can be more than one edge connecting two vertices) and loops (an edge
running from a node to itself), but all edges are undirected. We can also assume without
loss of generality that $G$ is connected. Assign a positive number $a_i$ to each node. If this
assignment has the property that for each $i$, $2a_i=\sum_j a_j$, where the sum is over all nodes
$j$ adjacent to $i$ (counting multiplicities of edges), then we call it 'PF2'. The column
vector $(a_1,\ldots,a_n)^t$ will be a strictly positive eigenvector (called the Perron-Frobenius
eigenvector) of the adjacency matrix of $G$, with eigenvalue 2. A multigraph has a PF2
assignment iff the eigenvalue $\lambda$ of its adjacency matrix with largest absolute value $|\lambda|$ is
$\lambda=2$ (see Theorem 2.5.1 below). For instance, for the multigraph o=o, corresponding
to adjacency matrix $\begin{pmatrix}0&2\\2&0\end{pmatrix}$, the assignment $a_1=1=a_2$ is PF2 but the assignment
$a_1=1$, $a_2=2$ is not. The question is, which multigraphs have a PF2 assignment? The
answer is given in Figure 2.9. The names $A_1^{(1)}$ to $E_8^{(1)}$ there come from Figure 3.2;
the names ${}^0\!A^0_n$ and $D^0_n$ are invented. We see that the PF2 multigraphs without loops
are precisely the extended Coxeter-Dynkin diagrams of A-D-E type, and their PF2
assignments are unique (up to constant proportionality) and are given by the labels $a_i$ of
the corresponding affine algebra (i.e. the numbers attached to the graphs in Figures 2.9
and 3.2).

The unextended diagrams have a similar depiction. For them, we assign positive
numbers $a_i$ to each node so that $2a_i\ge\sum_j a_j$, where as before we sum over all adjacent
$j$. We also require that for at least one vertex $i$, we don't get an equality. Call this a
PF2$^-$ assignment. A multigraph $G$ has a PF2$^-$ assignment iff the absolute value $|\lambda|$ of
each eigenvalue $\lambda$ of its adjacency matrix is $<2$. In Figure 1.4 we list all multigraphs
for which there is a PF2$^-$ assignment.
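The PF2 condition is easy to test mechanically. The following is a minimal sketch, not taken from the text: the helper name `is_pf2` and the list-of-lists encoding of the adjacency matrix are our own illustrative choices. It checks the double-edge multigraph o=o discussed above, and a star with four leaves (the extended $D_4$ diagram) with labels 2, 1, 1, 1, 1.

```python
def is_pf2(adjacency, labels):
    """True if all labels are positive and 2*a_i equals the sum of a_j over
    the neighbours j of i, counted with edge multiplicity."""
    n = len(labels)
    if any(a <= 0 for a in labels):
        return False
    return all(
        2 * labels[i] == sum(adjacency[i][j] * labels[j] for j in range(n))
        for i in range(n)
    )

# The multigraph o=o: two nodes joined by a double edge.
double_edge = [[0, 2],
               [2, 0]]

# A star: node 0 is the centre, nodes 1..4 are leaves.
star = [[0, 1, 1, 1, 1],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0],
        [1, 0, 0, 0, 0]]

print(is_pf2(double_edge, [1, 1]))    # True
print(is_pf2(double_edge, [1, 2]))    # False
print(is_pf2(star, [2, 1, 1, 1, 1]))  # True
```

Since the test only uses integer arithmetic, any rescaling of a valid labelling by a positive integer passes as well, in line with uniqueness only up to proportionality.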

Perron-Frobenius theory studies the eigenvectors/eigenvalues of nonnegative matrices.
We revisit this theory elsewhere in the book. The basic result is:

Theorem 2.5.1 (Perron-Frobenius) Let $A$ be an $n\times n$ matrix with real nonnegative
entries $A_{ij}\ge0$ ($1\le i,j\le n$).

(a) Let $\rho(A):=\max_\lambda|\lambda|$ be the maximum of the absolute values of the eigenvalues of $A$.
Then $\rho(A)$ is itself an eigenvalue of $A$, called the 'Perron-Frobenius eigenvalue', and
it has an eigenvector $(a_1,\ldots,a_n)^t\ge0$ (i.e. each $a_i\ge0$), called a 'Perron-Frobenius
eigenvector'.

(b) If it is not possible to simultaneously permute the rows and columns of $A$ so that $A$
takes the form
$$\begin{pmatrix}B&C\\0&D\end{pmatrix}$$
for submatrices $B,C,D$ (such a matrix $A$ is called 'irreducible'), then the Perron-Frobenius
eigenvector is strictly positive and is unique up to scalar multiples.

(c) Suppose $A$ is irreducible in the sense of (b), and $B$ is an $n\times n$ matrix obeying
$0\le B_{ij}\le A_{ij}$ $\forall i,j$. Then $\rho(B)\le\rho(A)$, with equality iff $B=A$.


See, for example, [420] for a proof and further results of this kind. In our case $A$ is the
adjacency matrix of a connected multigraph and so, being symmetric, is irreducible in
the sense of (b). The classification of all PF2 and PF2$^-$ multigraphs follows by repeatedly
applying Theorem 2.5.1(c) (see Question 2.5.2).
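To see Theorem 2.5.1(c) at work numerically, here is a rough sketch that estimates the Perron-Frobenius eigenvalue by power iteration. The function name, the iteration count and the shift by the identity are our own choices (the shift is a standard device so that bipartite graphs, whose spectrum contains both $\rho$ and $-\rho$, still converge). Deleting one edge of the 4-cycle, which has $\rho=2$, leaves the path on four nodes, whose largest eigenvalue $2\cos(\pi/5)$ is strictly smaller.

```python
import math

def perron_frobenius_eigenvalue(adjacency, iterations=500):
    """Estimate rho(A) for a symmetric, nonnegative, irreducible matrix A.

    Power iteration is run on A + I, which has the same eigenvectors as A
    and eigenvalues shifted by 1, so the result is corrected at the end.
    """
    n = len(adjacency)
    shifted = [[adjacency[i][j] + (1 if i == j else 0) for j in range(n)]
               for i in range(n)]
    v = [1.0] * n
    estimate = 0.0
    for _ in range(iterations):
        w = [sum(shifted[i][j] * v[j] for j in range(n)) for i in range(n)]
        estimate = max(w)            # v is normalised so that max(v) = 1
        v = [x / estimate for x in w]
    return estimate - 1.0

cycle4 = [[0, 1, 0, 1],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [1, 0, 1, 0]]

# The same graph with the edge between nodes 0 and 3 deleted.
path4 = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]

rho_cycle = perron_frobenius_eigenvalue(cycle4)
rho_path = perron_frobenius_eigenvalue(path4)
print(rho_cycle, rho_path, 2 * math.cos(math.pi / 5))
```

This is only an illustration of the monotonicity statement; the classification itself requires the exact argument of Question 2.5.2.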

What do eigenvalues have to do with the other A-D-E classifications? Consider a
finite subgroup $G$ of $\mathrm{SU}_2(\mathbb{C})$. Take the dimension of the equation $G\otimes R_i=\oplus_j m_{ij}R_j$:
we get $2d_i=\sum_j m_{ij}d_j$, where $d_j=\dim(R_j)$. Hence the dimensions of the irreducible
representations define a PF2 assignment for each of McKay's graphs, and hence those
graphs must be of A-D-E type (provided we know $m_{ij}=m_{ji}$ and $m_{ii}=0$).
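For the simplest family this can be checked directly. The sketch below is our own illustration and assumes the cyclic group $\mathbb{Z}_n$ is embedded in $\mathrm{SU}_2(\mathbb{C})$ through the diagonal matrices $\mathrm{diag}(\omega^k,\omega^{-k})$ with $\omega=e^{2\pi i/n}$. It computes the multiplicities $m_{ij}$ from characters. All irreducible representations are one-dimensional, so the PF2 condition $2d_i=\sum_j m_{ij}d_j$ reduces to every row of the matrix summing to 2.

```python
import cmath

def mckay_matrix_cyclic(n):
    """Multiplicities m[i][j] of the irrep j in (defining 2-dim rep) x (irrep i)
    for the cyclic group Z_n, computed as character inner products."""
    omega = cmath.exp(2j * cmath.pi / n)

    def chi(j, k):
        # Character of the one-dimensional irrep j at the group element g^k.
        return omega ** (j * k)

    def defining(k):
        # Character of diag(omega^k, omega^-k).
        return omega ** k + omega ** (-k)

    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            inner = sum(defining(k) * chi(i, k) * chi(j, k).conjugate()
                        for k in range(n)) / n
            m[i][j] = round(inner.real)
    return m

print(mckay_matrix_cyclic(2))   # the double edge o=o
print(mckay_matrix_cyclic(5))   # a 5-cycle
```

For $n=2$ the output is the adjacency matrix of the double-edge multigraph, and for $n\ge3$ it is a cycle, which is symmetric with zero diagonal.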

Or consider lattices: let $\alpha_i$ be a basis of a positive-definite lattice, with all norm-squareds
$\alpha_i\cdot\alpha_i=2$. Then by the Cauchy-Schwarz inequality, $\alpha_i\cdot\alpha_j\in\{0,\pm1\}$ for
$i\ne j$. For $i<j$, if $\alpha_i\cdot\alpha_j=+1$ then replace $\alpha_j$ with $\alpha_j-\alpha_i$. What this means is that
we can assume that each $\alpha_i\cdot\alpha_j\in\{0,-1\}$ for $i\ne j$. Put $A_{ij}=\alpha_i\cdot\alpha_j$ and $B=2I-A$.
Then $B$ is a symmetric N-matrix with zeros down the diagonal, and is easily seen to have
Perron-Frobenius eigenvalue $<2$. Thus $B$ falls into the A-D-E pattern.


Suggestion There are two different, though related, fundamental A-D-E patterns:
namely, the PF2 and PF2$^-$ multigraph classifications. Any other instance of an A-D-E
pattern reduces to one or the other of these.

Fig. 2.10 The tree corresponding to p = 3, q = 4, r = 5.

This suggestion should be treated with some caution: as simple singularities illustrate,
the same area may realise both types of A-D-E patterns, depending on the specific questions
asked. In particular, du Val corresponds to Figure 1.4 and McKay to Figure 2.9.
What relates these is that one of the nodes in the McKay graph (namely, that corresponding
to the identity) is distinguished, and when it is deleted du Val's graph is recovered.
We return to singularities in Section 3.2.5.

We encounter other A-D-E's later in this book. One of these (Theorem 6.2.2) is the
only instance of A-D-E known to this author that hasn't yet been related to PF2 or
PF2$^-$.

Incidentally, it is commonly suggested that a possible explanation for A-D-E may
be the set of all triples $p,q,r\in\mathbb{N}$ for which

$$\frac1p+\frac1q+\frac1r>1. \qquad (2.5.3)$$

Then $(1,q,r)$, $(2,2,r)$ and $(2,3,3)$, $(2,3,4)$, $(2,3,5)$ (corresponding to $A_{q+r-1}$,
$D_{r+2}$, $E_{6,7,8}$, respectively) exhausts all solutions except for p = 1, q + r. However, this
is not as fundamental as the graph explanation suggested above. In particular, given any
triple obeying (2.5.3), construct the tree consisting of three strings leaving a common
central vertex, of lengths $p-1$, $q-1$, $r-1$, respectively (see Figure 2.10). Give this
graph the assignment indicated in the figure; that is, label the $i$th vertex from the end of
the first (respectively second, third) string $i/p$ (respectively $i/q$, $i/r$). Then inequality (2.5.3)
is precisely the statement that this assignment is PF2$^-$, and thus that the graph will be
of (unextended) A-D-E type. The reverse implication, showing that any PF2$^-$ graph $G$
will necessarily correspond to a triple obeying (2.5.3), is much less elementary.
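The equivalence between (2.5.3) and the labelling can be confirmed by brute force. The following sketch rests on two assumptions of ours: the labels are $i/p$, $i/q$, $i/r$ along the three strings, with label 1 at the central vertex, and the condition tested is $2a_v\ge\sum a_w$ over neighbours $w$, with strict inequality at some vertex. The helper names are hypothetical.

```python
from fractions import Fraction

def satisfies_2_5_3(p, q, r):
    return Fraction(1, p) + Fraction(1, q) + Fraction(1, r) > 1

def star_tree(p, q, r):
    """Neighbour lists and labels for the tree with three strings of
    lengths p-1, q-1, r-1 attached to a central vertex labelled 1."""
    labels = {"centre": Fraction(1)}
    neighbours = {"centre": []}
    for arm, length in enumerate((p, q, r)):
        previous = None
        for i in range(1, length):          # i-th vertex from the free end
            node = (arm, i)
            labels[node] = Fraction(i, length)
            neighbours[node] = []
            if previous is not None:
                neighbours[node].append(previous)
                neighbours[previous].append(node)
            previous = node
        if previous is not None:            # innermost vertex joins the centre
            neighbours[previous].append("centre")
            neighbours["centre"].append(previous)
    return neighbours, labels

def is_pf2_minus(neighbours, labels):
    gaps = [2 * labels[v] - sum(labels[w] for w in neighbours[v]) for v in labels]
    return all(g >= 0 for g in gaps) and any(g > 0 for g in gaps)

# Solutions with p >= 2 and q >= 3 (the E-type triples), searched up to 9.
exceptional = [(p, q, r)
               for p in range(2, 10)
               for q in range(max(p, 3), 10)
               for r in range(q, 10)
               if satisfies_2_5_3(p, q, r)]
print(exceptional)   # [(2, 3, 3), (2, 3, 4), (2, 3, 5)]
```

Every vertex on a string satisfies the condition with equality, so the only place where strictness can occur is the central vertex, where the gap equals $\frac1p+\frac1q+\frac1r-1$.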

What comes after A-D-E? Natural candidates should be the graphs with largest
eigenvalue $\rho=3$, say. For the same reason that those with $\rho=2$ arise in so many
contexts, those with $\rho=3$ surely will too. The difference is that the number and variety
of graphs grows dramatically with the largest eigenvalue $\rho$. The list of graphs with
$\rho=2$ has such a simple and tight structure that different situations will automatically
share a family resemblance, provided only that they depend critically on graphs with
$\rho=2$. For instance, the eigenvalues of any graph with $\rho=n$ must be character values
of an $n$-dimensional representation of $\mathrm{SU}_n$ if the graph is to have a chance at being the
McKay graph of a finite subgroup of $\mathrm{SU}_n$; although this is automatic for $n=2$, it is a
severe constraint for $n\ge3$. A different $\rho=3$ situation can carry with it its own severe
constraints, which would thus overwhelm the presence of the $\rho=3$ graphs. We could
say that $\rho=2$ is a dominant gene, while $\rho=3$ is recessive; this is why A-D-E is so
ubiquitous, and why there seems to be no effective successor meta-pattern to A-D-E.
(But see Section 6.3.2.)

For a final meta-pattern, consider 'modular functions'. After all, they appear in many
places and disguises. Maybe we shouldn't regard their ubiquity as fortuitous. Instead,
perhaps there's a deeper common 'situation' that is the source for that ubiquity.
Two-dimensional lattices, perhaps? Riemann surfaces? The braid group $\mathcal{B}_3$?


Question 2.5.1. Let $L\subset\mathbb{R}^n$ be an even self-dual $n$-dimensional lattice. Assume there
exists an orthonormal basis $e_i$ of $\mathbb{R}^n$ and a number $k$ such that the orthogonal lattice
$2^k(\mathbb{Z}e_1\oplus\cdots\oplus\mathbb{Z}e_n)$ is a sublattice of $L$ (this is true for any self-dual $L$; see theorem 3.15
of [238]).

(a) Let $L'$ be the orthonormal lattice $\mathbb{Z}e_1\oplus\cdots\oplus\mathbb{Z}e_n$. Then the abelian group $L/(L\cap L')$
must be isomorphic to $\mathbb{Z}_{2^{k_1}}\times\cdots\times\mathbb{Z}_{2^{k_m}}$ for $0<k_m\le\cdots\le k_1\le k$. Generators
$\omega_1,\ldots,\omega_m\in L$ for it can be chosen so that $\omega_i=2^{-k_i}\sum_j\omega_{ij}e_j$, where $\omega_{ij}\in\mathbb{Z}$, such
that $\sum_i c_i\omega_i\in L'$ for $c_i\in\mathbb{Z}$ iff $2^{k_i}$ divides $c_i$ for each $i$. Prove that there exist vectors
$r_1,\ldots,r_m\in L'$ such that $r_i\cdot\omega_j=2^{-k_i}\delta_{ij}$ (mod 1).

(b) Let $x=\sum_i 2^{k_i}r_i^2\,\omega_i=\sum_j x_je_j$, so $x\in L$ and $x_j\in\mathbb{Z}$. Prove that each $x_j$ is
odd (Hint: consider $\sum_i\omega_{ij}r_i-e_j$).

(c) Conclude from (2.5.1) that 8 must divide the dimension $n$.


Question 2.5.2. Using Theorem 2.5.1(c), prove that the multigraphs in Figures 1.4 and
2.9 exhaust all connected multigraphs whose eigenvalues $\lambda$ all obey $|\lambda|\le2$.


Question 2.5.3. Why are there no loops in the McKay graph corresponding to any finite 
subgroup of SL2(C)? Why don’t these McKay graphs have directed edges? 


Question 2.5.4. The classifications in Figures 1.4 and 2.9 depend on the requirement
that the matrices be symmetric, i.e. that the multigraphs have no arrows. Find all $2\times2$
nonnegative integer matrices whose eigenvalues $\lambda$ all obey $|\lambda|<2$.


3 


Gold and brass: affine algebras and generalisations 


This chapter introduces the nontwisted affine algebras (infinite-dimensional Lie algebras
of considerable mathematical and physical interest) and searches for generalisations
that preserve and enhance those special features. The affine algebras supply classic
examples of Moonshine, in that the characters of their integrable modules are
vector-valued Jacobi functions for $\mathrm{SL}_2(\mathbb{Z})$. They thread through the remainder of the book,
guiding all subsequent mathematical developments. Their Lie groups are discussed in
Section 3.2.6.

Algebraically, the affine algebras naturally generalise to the Kac-Moody algebras 
(Section 3.3.1), although that generalisation seems to lose some of their magic. In turn, 
the Kac—Moody algebras generalise naturally to the Borcherds-Kac—Moody algebras 
(Section 3.3.2), which play a significant role in Borcherds’ proof of Monstrous Moon- 
shine through their denominator identities (Section 3.4.2). Two other natural generalisa- 
tions of affine algebras are described elsewhere in Section 3.3. In Section 3.4.1 we study 
an important special case of what we later call the orbifold construction, and in the final 
subsection we touch on a more recent and tangential development. 

The Virasoro algebra (Section 3.1.2) plays a prominent structural role in conformal 
field theory (Chapter 4) and vertex operator algebras (Chapter 5); its relation to moduli 
spaces is a fundamental source of Moonshine itself. 


3.1 Modularity from the circle 
3.1.1 Central extensions 


Let $V$ be any (complex) vector space, and let $\mathrm{GL}(V)$ denote the group of all invertible
linear maps $V\to V$. A projective representation of a group $G$ is a map $P:G\to\mathrm{GL}(V)$
such that $P(e)=I$ (the identity), and given any elements $g,h\in G$, there is a nonzero
complex number $\alpha(g,h)$ such that

$$P(g)\,P(h)=\alpha(g,h)\,P(gh). \qquad (3.1.1a)$$

We call $P$ an $\alpha$-representation. So just as a (true) representation is a group homomorphism
$R:G\to\mathrm{GL}(V)$, a projective representation defines a group homomorphism $P$
from $G$ into the projective group $\mathrm{PGL}(V):=\mathrm{GL}(V)/\{\mathbb{C}^\times I\}$ (hence the name); conversely,
given a homomorphism $\pi:G\to\mathrm{PGL}(V)$, arbitrarily choosing a 'section', that
is a representative $P(g)\in\mathrm{GL}(V)$ in each equivalence class $\pi(g)\in\mathrm{PGL}(V)$, defines a
projective representation of $G$. A projective representation $P$ is a true representation iff
$\alpha(g,h)=1$ for all $g,h\in G$.

Projective representations are plentiful. For example, the multiplier $\mu$ in Definition 2.2.1
is a projective representation of $\mathrm{SL}_2(\mathbb{Z})$ whenever the weight $k$ is rational.
Quasi-periodicity (2.3.5a) is a projective representation of the abelian group $\mathbb{C}^2$ on the
space of functions $f:\mathbb{C}\to\mathbb{C}$. In quantum physics (Section 4.2) the state of a system
is completely described by a nonzero vector $v$ in a Hilbert space. However, any nonzero
multiple $\lambda v$ describes a physically identical state. Thus projective representations arise
naturally also in quantum physics, where they are called 'ray representations'.

Note that associativity

$$\alpha(h,k)\,\alpha(g,hk)\,P(ghk)=P(g)\,(P(h)P(k))=(P(g)P(h))\,P(k)=\alpha(g,h)\,\alpha(gh,k)\,P(ghk)$$

tells us that

$$\alpha(h,k)\,\alpha(g,hk)=\alpha(gh,k)\,\alpha(g,h),\qquad\forall g,h,k\in G. \qquad (3.1.1b)$$

This equation may remind the reader of a two-cocycle condition, hinting at the relevance
of cohomology. Indeed, this function $\alpha:G\times G\to\mathbb{C}^\times$ is called a 2-cocycle and group
cohomology organises the projective representations.
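A concrete 2-cocycle makes (3.1.1b) tangible. The example below is our own choice and not from the text: on $G=\mathbb{Z}_n\times\mathbb{Z}_n$ we take $\alpha((a,b),(c,d))=\omega^{bc}$ with $\omega=e^{2\pi i/n}$, which is the cocycle one gets from the operators $X^aZ^b$ built out of two matrices obeying $ZX=\omega XZ$. To stay in exact arithmetic the sketch tracks only the exponent of $\omega$ modulo $n$.

```python
from itertools import product

def alpha_exponent(g, h, n):
    """Exponent e with alpha(g, h) = omega**e, where omega = exp(2*pi*i/n)."""
    (_, b), (c, _) = g, h
    return (b * c) % n

def is_two_cocycle(n):
    """Check alpha(h,k) alpha(g,hk) = alpha(gh,k) alpha(g,h) on Z_n x Z_n."""
    group = list(product(range(n), repeat=2))

    def mul(g, h):
        return ((g[0] + h[0]) % n, (g[1] + h[1]) % n)

    for g, h, k in product(group, repeat=3):
        lhs = (alpha_exponent(h, k, n) + alpha_exponent(g, mul(h, k), n)) % n
        rhs = (alpha_exponent(mul(g, h), k, n) + alpha_exponent(g, h, n)) % n
        if lhs != rhs:
            return False
    return True

print(is_two_cocycle(2), is_two_cocycle(3))
# alpha is not symmetric, although the group itself is abelian:
print(alpha_exponent((0, 1), (1, 0), 3), alpha_exponent((1, 0), (0, 1), 3))
```

The asymmetry $\alpha(g,h)\ne\alpha(h,g)$ in an abelian group shows that the corresponding operators cannot all commute, so this projective representation is not obtained by rescaling a true one.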

Two projective representations $P_i:G\to\mathrm{GL}(V_i)$ are (linearly) equivalent if there
is a vector space isomorphism $\varphi:V_1\to V_2$ such that $\varphi\circ P_1(g)\circ\varphi^{-1}=P_2(g)$ for all $g\in G$. Equivalent
projective representations must have the same 2-cocycle $\alpha$. For a given $\alpha$, the number of
inequivalent irreducible $\alpha$-representations of $G$ equals the number of conjugacy classes
of $\alpha$-regular elements $g\in G$ ($g$ is called $\alpha$-regular if $\alpha(g,h)=\alpha(h,g)$ for all $h\in C_G(g)$).
Hence this number is at most the number of inequivalent irreducible true
$G$-representations.

We call projective representations $P_i:G\to\mathrm{GL}(V_i)$ projectively equivalent when
there is a function $\beta:G\to\mathbb{C}^\times$ and a vector space isomorphism $\varphi:V_1\to V_2$ such that

$$\varphi\circ P_1(g)\circ\varphi^{-1}=\beta(g)\,P_2(g),\qquad\forall g\in G.$$

The 2-cocycles of projectively equivalent projective representations are related by

$$\alpha_2(g,h)=\alpha_1(g,h)\,\beta(gh)\,\beta^{-1}(g)\,\beta^{-1}(h).$$

$\beta$ plays the role of a coboundary, so the 2-cocycles $\alpha_i$ of projectively equivalent projective
representations lie in the same cohomology class $[\alpha]\in H^2(G,\mathbb{C}^\times)$, and $H^2(G,\mathbb{C}^\times)$
classifies the projectively inequivalent projective representations. $H^2(G,\mathbb{C}^\times)$ is an abelian
group, called the Schur multiplier, and is finite when $G$ is finite. The point of converting
a problem into algebraic topology is that machinery (and experts!) are available to help
compute these groups. For example, $H^2(\mathbb{Z}_n,\mathbb{C}^\times)=H^2(\mathrm{SL}_2(\mathbb{Z}),\mathbb{C}^\times)=H^2(\mathbb{M},\mathbb{C}^\times)=\{0\}$
while $H^2(Co_1,\mathbb{C}^\times)\cong\mathbb{Z}_2$. This implies, for instance, that any projective representation
of the Monster $\mathbb{M}$ is projectively equivalent to a true representation of $\mathbb{M}$.

Projective representations of Lie algebras are defined similarly: $P:\mathfrak{g}\to\mathrm{End}(V)$ is
linear, and equations (3.1.1) become

$$[P(x),P(y)]=P([xy])+c(x,y)\,I, \qquad (3.1.2a)$$
$$c(x,y)=-c(y,x), \qquad (3.1.2b)$$
$$c([xy],z)+c([yz],x)+c([zx],y)=0, \qquad (3.1.2c)$$

where the 2-cocycle $c$ is complex-valued and $I$ is the identity endomorphism.

Geometrically, projective representations often arise from the following fundamental
construction. Let $\mathcal{L}\to M$ be any line bundle with connection $\nabla$ over some manifold $M$
(Section 1.2.2). Let $\varphi:\mathfrak{g}\to\mathrm{Vect}(M)$ be a homomorphism from some Lie algebra $\mathfrak{g}$ to
the Lie algebra of vector fields on $M$. The map $x\mapsto\nabla_{\varphi(x)}$, sending $x\in\mathfrak{g}$ to the covariant
derivative in the direction $\varphi(x)$, associates with each $x\in\mathfrak{g}$ a differential operator on the
space of sections of $\mathcal{L}$. Since

$$[\nabla_X,\nabla_Y]=\nabla_{[X,Y]}+R(X,Y)\,I$$

for all vector fields $X,Y$, where $R$ is the curvature of the connection, this map defines a
projective representation of $\mathfrak{g}$ on the space $\Gamma(\mathcal{L})$ of sections of $\mathcal{L}$, with cocycle $c=R$. As
we will see later in this chapter, the central extensions of both the Witt and the loop algebras
can be interpreted in this way [13]. This construction is well known in physics, where it
falls under the slogan 'curvature is a local anomaly' (by contrast, global anomalies are
monodromy effects like modularity).

A standard trick (central extensions) converts projective representations into true
representations. Let $G$ be any group, and let $A$ be any abelian group. By a central extension
$\hat G$ of $G$ by $A$, we mean that $A$ can be identified with a subgroup of the centre of $\hat G$,
and the quotient $\hat G/A$ is isomorphic to $G$. For example, the dihedral group $D_4$ is a
central extension (by $\mathbb{Z}_2$) of a central extension (by $\mathbb{Z}_2$) of a central extension (by $\mathbb{Z}_2$)
of $\{e\}$.

Let $P$ be a projective representation of a group $G$, and assume for simplicity that
no operator $P(g)$ with $g\ne e$ is a scalar multiple $aI$ of the identity. Let $\hat G$ be the group consisting
of all operators $aP(g)$, for $a\in\mathbb{C}^\times$ and $g\in G$. Then $\hat G$ is a central extension of $G$ by
$\mathbb{C}^\times$, and $\hat G$ is defined by a faithful representation in $V$. The projective representation
of $G$ has been transformed into a true representation of the larger group $\hat G$. The specific
situation for finite groups and the most common finite-dimensional Lie groups is
simpler:


Theorem 3.1.1 (a) Let $G$ be a finite group. Then there is a central extension $\hat G$ of $G$ by its
Schur multiplier $H^2(G,\mathbb{C}^\times)$, with the following property: any projective representation
$P:G\to\mathrm{GL}(V)$ of $G$ lifts to a true representation $\hat P:\hat G\to\mathrm{GL}(V)$ of $\hat G$.

(b) Let $G$ be a connected, finite-dimensional semi-simple Lie group over $\mathbb{R}$ or $\mathbb{C}$, and
let $\tilde G$ be its universal cover group (which is a central extension of $G$ by the fundamental
group $\pi_1(G)$). Then any continuous finite-dimensional projective representation
$P:G\to\mathrm{GL}(V)$ of $G$ lifts to a true representation $\tilde P:\tilde G\to\mathrm{GL}(V)$ of $\tilde G$.

Conversely, a true representation of $\hat G$ restricts to a projective representation of $G$. The
central extension $\hat G$ in Theorem 3.1.1(a) is a finite group (e.g. for $Co_1$ take the Conway
group $Co_0$), and in (b) is a Lie group of the same dimension as $G$ (see Theorem 1.4.3). For
Lie groups there is a topological ($\pi_1$) as well as cohomological ($H^2$) obstacle to the
trivialisation of projective representations. The assumption in (b) that $G$ be semi-simple was
made only to guarantee that the Schur multiplier of $G$ would be trivial. The conclusion to
Theorem 3.1.1(b) also holds for certain non-semi-simple Lie groups, such as the Poincaré
group important to relativistic physics. On the other hand, the Galilei group, which plays
the same role in pre-relativistic physics, has nontrivial Schur multiplier. In this case, the
relevant cover will be a Lie group of higher dimension. The simplest example of this
phenomenon is the additive group $\mathbb{C}^2$, and its central extension the three-dimensional
Heisenberg group (see Question 3.1.3). It is through projective representations of $\mathbb{C}^2$
that the Heisenberg group and algebra arise in both theta functions (Section 2.4.2) and
quantum physics (Section 4.2). Similarly, the Galilei group must act on nonrelativistic
wave-functions (i.e. solutions to the Schrödinger equation (4.2.1)) projectively;
this is a consequence of the nontriviality of the Schur multiplier of the Galilei group
(Question 4.2.1).

Incidentally, the Schur multiplier $H^2(G,\mathbb{C}^\times)$ of a finite group $G$ appears in another
context. Consider any presentation of $G$, with say $m$ generators and $n$ relations. The
finiteness of $G$ requires that $m\le n$. The Schur multiplier of $G$ is a finite abelian group,
so let $h$ be its number of generators as in Theorem 1.1.1. Then $n-m\ge h$.

We are primarily interested in one-dimensional central extensions $\hat{\mathfrak g}$ of Lie algebras
$\mathfrak g$, that is a vector space $\hat{\mathfrak g}=\mathfrak g\oplus\mathbb{C}C$ together with the brackets

$$[ab]_{new}=[ab]_{old}+c(a,b)\,C, \qquad (3.1.3a)$$
$$[aC]_{new}=0. \qquad (3.1.3b)$$

The element $C$ is called the central term. Equivalently, we have

$$0\to\mathbb{C}\to\hat{\mathfrak g}\to\mathfrak g\to0, \qquad (3.1.3c)$$

together with the requirement that the image $\mathbb{C}C$ of $\mathbb{C}$ in $\hat{\mathfrak g}$ is in the centre of $\hat{\mathfrak g}$. The short
exact sequence (3.1.3c) says that there is an ideal in $\hat{\mathfrak g}$ (namely the image of the second
arrow) isomorphic as a Lie algebra to $\mathbb{C}$, and that when this ideal is projected out (by the
third arrow) we recover $\mathfrak g$.

The exact sequence (3.1.3c) has the charm of not requiring an explicit splitting of
$\hat{\mathfrak g}$ into a $\mathfrak g$-part (namely, a lift of the Lie algebra $\mathfrak g$ onto a subspace of $\hat{\mathfrak g}$) and a $\mathbb{C}$-part
$\mathbb{C}C$. The point is that there are many possible splittings: for example, given any such
splitting $\hat{\mathfrak g}=\mathfrak g\oplus\mathbb{C}C$, choose a linear map $f:\mathfrak g\to\mathbb{C}$; then a new splitting is obtained
by replacing the subspace $\mathfrak g$ with the span of the $a+f(a)\,C$, as $a$ runs through $\mathfrak g$. Modern
mathematics abhors arbitrary choices, and so would encourage us to delay the choice of
such a splitting as long as Good Fortune permits. Of course this is merely the current
century-long fad, and there are advantages and disadvantages to it, and indeed physics
prefers the opposite choice.

$\hat{\mathfrak g}$ will be a Lie algebra iff the function $c:\mathfrak g\times\mathfrak g\to\mathbb{C}$ obeys (3.1.2b), (3.1.2c); as
before, $c$ is called the 2-cocycle associated with the extension (3.1.3). The trivial
2-cocycle $c=0$ always works, in which case $\hat{\mathfrak g}$ is merely the Lie algebra direct sum $\mathfrak g\oplus\mathbb{C}$.

We regard two extensions $\hat{\mathfrak g}_1$, $\hat{\mathfrak g}_2$ as equivalent if there is a Lie algebra isomorphism
$\psi:\hat{\mathfrak g}_1\to\hat{\mathfrak g}_2$ that sends the ideal $\mathbb{C}C_1$ of $\hat{\mathfrak g}_1$ onto $\mathbb{C}C_2\subset\hat{\mathfrak g}_2$. One way (but not the only
way) to get equivalent extensions is to change the splitting $\hat{\mathfrak g}=\mathfrak g\oplus\mathbb{C}C$, as mentioned
before. In the language of Lie algebra cohomology (see e.g. [183] for a mathematical
treatment, or [27] for a physically motivated one), $f:\mathfrak g\to\mathbb{C}$ defines a 2-coboundary, and the
resulting 2-cocycles $c_1$, $c_2$ define the same class in the cohomology space $H^2(\mathfrak g)$. There
are other ways though to obtain equivalent extensions (for example, the central term
can be rescaled), so $H^2(\mathfrak g)$ is in general too fine to serve as a 'moduli space' of
one-dimensional central extensions of $\mathfrak g$, but it gives a very useful partial answer. For example,
$H^2(\mathfrak g)$ is trivial for any finite-dimensional semi-simple Lie algebra $\mathfrak g$, which means any
such $\mathfrak g$ has only trivial central extensions (see Question 3.1.4).

For a concrete example, consider the $n$-dimensional abelian Lie algebra $\mathfrak h=\mathbb{C}^n$, with
basis $\{e_1,\ldots,e_n\}$. A one-dimensional central extension $\hat{\mathfrak h}$ of $\mathfrak h$ is uniquely determined
by $n^2$ numbers $\alpha_{ij}\in\mathbb{C}$ defined by $[e_ie_j]=\alpha_{ij}C$, where $C\in\hat{\mathfrak h}$ is central (all other
brackets of $\hat{\mathfrak h}$ are determined by bilinearity and $[e_iC]=0$). Anti-commutativity requires
$\alpha_{ij}=-\alpha_{ji}$, and anti-associativity is automatically satisfied. Thus each choice of an
anti-symmetric $n\times n$ matrix $A=(\alpha_{ij})$ defines a one-dimensional central extension $\hat{\mathfrak h}_A$ of
$\mathfrak h=\mathbb{C}^n$, and conversely. The dependence of this argument on an arbitrary choice of basis
$e_i$ means there is redundancy here: in particular, two such central extensions $\hat{\mathfrak h}_A$ and $\hat{\mathfrak h}_{A'}$
define isomorphic Lie algebras iff there is an invertible matrix $B$ such that $A'=BAB^t$.

The reader can verify that any anti-symmetric matrix $A$ is equivalent in this sense to
the direct sum of $k$ copies of $\begin{pmatrix}0&1\\-1&0\end{pmatrix}$, and $\ell=n-2k$ copies of $(0)$, where $2k$ is the
rank of $A$. Thus we get a different one-dimensional central extension of $\mathbb{C}^n$, for each
$k=0,1,\ldots,\lfloor n/2\rfloor$. When $A$ is invertible (i.e. $k=n/2$), we call $\hat{\mathfrak h}$ a Heisenberg algebra;
as simple a (non-simple!) Lie algebra as it is, it's one of the most important.
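The construction can be played with directly. In the sketch below (our own encoding, with hypothetical helper names) an element of the extension is a pair (vector in $\mathbb{C}^n$, coefficient of $C$), the bracket of two elements is $v^tAw$ times $C$, and the rank of $A$, computed by exact Gaussian elimination, gives the number $2k$ appearing in the normal form.

```python
from fractions import Fraction

def bracket(x, y, A):
    """[(v, s), (w, t)] = (0, v^t A w): only the central coefficient survives."""
    v, w = x[0], y[0]
    n = len(v)
    central = sum(v[i] * A[i][j] * w[j] for i in range(n) for j in range(n))
    return ([0] * n, central)

def add(x, y):
    return ([a + b for a, b in zip(x[0], y[0])], x[1] + y[1])

def jacobi_holds(x, y, z, A):
    total = add(add(bracket(bracket(x, y, A), z, A),
                    bracket(bracket(y, z, A), x, A)),
                bracket(bracket(z, x, A), y, A))
    return total == ([0] * len(A), 0)

def rank(matrix):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# n = 3 with one 2x2 block and one zero block: k = 1, l = 1.
A = [[0, 1, 0],
     [-1, 0, 0],
     [0, 0, 0]]
e1, e2, e3 = ([1, 0, 0], 0), ([0, 1, 0], 0), ([0, 0, 1], 0)

print(bracket(e1, e2, A))           # ([0, 0, 0], 1), i.e. [e1 e2] = C
print(jacobi_holds(e1, e2, e3, A))  # True
print(rank(A))                      # 2
```

The Jacobi identity holds for a trivial reason here: every bracket lands in $\mathbb{C}C$, and brackets with $C$ vanish, so each double bracket is zero.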


3.1.2 The Virasoro algebra 


Recall the Witt algebra $\mathfrak{Witt}$ in (1.4.9). For each choice of $\alpha,\beta\in\mathbb{C}$, we get a module
$V_{\alpha,\beta}$, with basis $v_k$, $k\in\mathbb{Z}$, given by

$$\ell_n.v_k=-(k+\alpha+\beta+\beta n)\,v_{k+n}. \qquad (3.1.4a)$$

This can be obtained from the derived module (Section 1.5.5) coming from the natural
action of a subgroup of the diffeomorphism group $\mathrm{Diff}(S^1)$ on the space of differential
'forms' $p(z)\,z^\alpha(dz)^\beta$, where $p(z)\in\mathbb{C}[z^{\pm1}]$ are Laurent polynomials. Clearly, $V_{\alpha+m,\beta}\cong V_{\alpha,\beta}$
for any $m\in\mathbb{Z}$.
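That formula (3.1.4a) really defines a module can be checked term by term. The sketch below assumes the action $\ell_n v_k=-(k+\alpha+\beta+\beta n)\,v_{k+n}$ and the Witt bracket $[\ell_m\ell_n]=(m-n)\ell_{m+n}$; the latter is our reading of the convention in (1.4.9), which is not reproduced in this section. Rational values of $\alpha,\beta$ are used so that the arithmetic is exact.

```python
from fractions import Fraction

def act(n, k, alpha, beta):
    """l_n applied to v_k: returns (coefficient, index of the resulting basis vector)."""
    return (-(k + alpha + beta + beta * n), k + n)

def commutator_coefficient(m, n, k, alpha, beta):
    """Coefficient of v_{k+m+n} in (l_m l_n - l_n l_m) v_k."""
    c1, k1 = act(n, k, alpha, beta)
    c2, _ = act(m, k1, alpha, beta)      # l_m l_n v_k
    d1, j1 = act(m, k, alpha, beta)
    d2, _ = act(n, j1, alpha, beta)      # l_n l_m v_k
    return c1 * c2 - d1 * d2

def is_witt_module(alpha, beta, bound=4):
    """Compare the commutator with (m - n) l_{m+n} on a finite window of indices."""
    window = range(-bound, bound + 1)
    return all(
        commutator_coefficient(m, n, k, alpha, beta)
        == (m - n) * act(m + n, k, alpha, beta)[0]
        for m in window for n in window for k in window
    )

print(is_witt_module(Fraction(1, 3), Fraction(1, 2)))
```

The check is of course finite, but the identity behind it is the polynomial one $(x+\beta n)(x+n+\beta m)-(x+\beta m)(x+m+\beta n)=(n-m)(x+\beta(m+n))$ with $x=k+\alpha+\beta$.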

As usual, we are interested in unitary modules (Section 1.5.1), and for this we need
an anti-homomorphism $\omega$ of $\mathfrak{Witt}$. Up to an automorphism of $\mathfrak{Witt}$, the unique choice
is $\omega\ell_n=\ell_{-n}$. Then for this choice, $V_{\alpha,\beta}$ is unitary iff both $\mathrm{Re}(\beta)=1/2$ and $\alpha+\beta\in\mathbb{R}$
[334]. These modules are also irreducible.

The element $\ell_0\in\mathfrak{Witt}$ is obviously special and plays the role of energy operator
(Hamiltonian) in the application to physics. The most interesting $\mathfrak{Witt}$-modules are
unitary ones with diagonalisable $\ell_0$. In this case the eigenvalues of $\ell_0$ will necessarily
be real, and should have the physical interpretation of energy. Unfortunately, the only
nontrivial unitary irreducible $\mathfrak{Witt}$-modules with $\ell_0$ diagonalisable are those $V_{\alpha,\beta}$. This is
unfortunate because the eigenvalues of $\ell_0$ in any $V_{\alpha,\beta}$ have no upper or lower bound. For
reasons of stability, physics wants energy to be bounded below. The space $V_{\alpha,\beta}$ is
infinite-dimensional, but $\ell_0$ defines on it a natural grading into finite-dimensional subspaces, and
so we are led to formally define its graded-dimension to be

$$\mathrm{tr}_{V_{\alpha,\beta}}\,q^{\ell_0}=\sum_{k\in\mathbb{Z}}q^{-k-\alpha-\beta}. \qquad (3.1.4b)$$

Unfortunately this never converges.
Central extensions are a common theme in infinite-dimensional Lie theory.¹ Their
raison d'être is always the same: a richer supply of representations. The Virasoro algebra
$\mathfrak{Vir}$ is the one-dimensional central extension $\mathfrak{Vir}=\mathfrak{Witt}\oplus\mathbb{C}C$ with brackets

$$[L_mL_n]=(m-n)\,L_{m+n}+\delta_{n,-m}\,\frac{m(m^2-1)}{12}\,C, \qquad (3.1.5a)$$
$$[L_mC]=0. \qquad (3.1.5b)$$

As always, we avoid convergence issues by defining $\mathfrak{Vir}$ to consist of only finite linear
combinations of these basis vectors. Incidentally, a common mistake in the physics
literature is to regard $C$ as a number: it is in fact a vector, though in most modules of interest
to, for example, mathematical physics it is mapped to a scalar multiple $cI$ of the identity.
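The Jacobi identity for the Virasoro bracket can be verified exactly on a finite range of indices. The sketch below assumes the bracket in the form $[L_mL_n]=(m-n)L_{m+n}+\delta_{m+n,0}\,\frac{m(m^2-1)}{12}\,C$ with $C$ central. The dictionary encoding (integer keys for the $L_k$, the key "C" for the central term) and the function names are our own.

```python
from fractions import Fraction
from itertools import product

def gamma(m):
    """Central coefficient m(m^2 - 1)/12 appearing in [L_m, L_{-m}]."""
    return Fraction(m * (m * m - 1), 12)

def bracket_LL(m, n):
    """[L_m, L_n] as a dict: integer key k holds the coefficient of L_k, 'C' that of C."""
    result = {m + n: Fraction(m - n)}
    if m + n == 0:
        result["C"] = gamma(m)
    return result

def bracket_with_L(element, n):
    """[element, L_n] for an element encoded as above; C brackets to zero."""
    out = {}
    for key, coeff in element.items():
        if key == "C":
            continue
        for k2, c2 in bracket_LL(key, n).items():
            out[k2] = out.get(k2, 0) + coeff * c2
    return out

def jacobi_defect(a, b, c):
    """Nonzero components of [[L_a,L_b],L_c] + [[L_b,L_c],L_a] + [[L_c,L_a],L_b]."""
    total = {}
    for x, y, z in ((a, b, c), (b, c, a), (c, a, b)):
        for key, coeff in bracket_with_L(bracket_LL(x, y), z).items():
            total[key] = total.get(key, 0) + coeff
    return {k: v for k, v in total.items() if v != 0}

failures = [t for t in product(range(-5, 6), repeat=3) if jacobi_defect(*t)]
print(failures)                          # []
print(gamma(-1), gamma(0), gamma(1))     # 0 0 0
```

The vanishing of the central coefficient at $m=-1,0,1$ means that $L_{-1},L_0,L_1$ close among themselves without any central term.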

The reason for the strange-looking (3.1.5a) is that we have little choice: $\mathfrak{Vir}$ is the
unique nontrivial one-dimensional central extension of $\mathfrak{Witt}$ (Question 3.1.5). The factor
$\frac1{12}$ there is conventional, but arises naturally in the realisations of $\mathfrak{Vir}$ by normal-ordered
operators in Fock spaces (see (3.2.13), (3.2.14) for such a calculation). In fact, the
normal-ordering prescription is somewhat arbitrary and actually we are much more interested
in a slightly different basis of $\mathfrak{Vir}$, with $L_0$ replaced by $L_0-C/24$. This is the combination
appearing in almost every expression for characters from this point on. Where
does this $-C/24$ come from? With this modified $L_0$, the brackets (3.1.5a) simplify
(Question 3.1.8). According to conformal field theory or vertex operator algebras, this
new basis corresponds to a change in topology (see Section 5.3.4), which can be calculated
using the Atiyah-Singer Index Theorem [8], so physically the 'conformal anomaly'
term $-c/24$ is a Casimir effect. But the best algebraic explanation for this $-c/24$ is given
in Section 3.2.3.

As before, $L_0\in\mathfrak{Vir}$ is the energy operator, and so we want irreducible $\mathfrak{Vir}$-modules
where $L_0$ is diagonalisable and its eigenvalues are bounded below. Let $v$ be any eigenvector
of $L_0$ in such a module, say $L_0v=Ev$, and suppose $L_nv\ne0$ for some $n>0$. Then
$L_0(L_nv)=-nL_nv+L_nL_0v=(E-n)L_nv$ and thus $(L_n)^\ell v$ will be an eigenvector of
$L_0$ whose eigenvalue $E-n\ell$ has real part going to $-\infty$ as $\ell\to\infty$. Thus any
$\mathfrak{Vir}$-module whose $L_0$-eigenvalues have real part bounded below must be a highest-weight
module.

1 On the other hand, the finite-dimensional simple Lie algebras do not have nontrivial central extensions.

More precisely, because $\mathfrak{Vir}$ has a triangular decomposition (recall (1.5.5d))

$$\mathfrak{Vir}_-\oplus\mathfrak{Vir}_0\oplus\mathfrak{Vir}_+=\mathrm{span}\{L_n\}_{n<0}\oplus\mathrm{span}\{L_0,C\}\oplus\mathrm{span}\{L_n\}_{n>0},$$

we can mimic the construction of highest-weight modules in Section 1.5.3. In particular,
for any $h,c\in\mathbb{C}$, the Verma module $M(c,h)$ is the universal $\mathfrak{Vir}$-module generated by a
vector $v\ne0$ obeying

$$L_0v=hv,\qquad Cv=cv,\qquad L_nv=0\quad\forall n>0.$$

The pair $(c,h)$ is the highest weight; $c$ is the central charge and $h$ the conformal weight.
As before, it can be more explicitly defined using the universal enveloping algebra, or
equivalently by inducing the module from $\mathfrak{Vir}_0\oplus\mathfrak{Vir}_+$ to all of $\mathfrak{Vir}$. By the
Poincaré-Birkhoff-Witt Theorem 1.5.2, $M(c,h)$ has a basis given by all vectors

$$L_{-i_1}L_{-i_2}\cdots L_{-i_k}v,$$

for all integers $i_1\ge i_2\ge\cdots\ge i_k\ge1$. Any other $\mathfrak{Vir}$-module with highest weight $(c,h)$
is a homomorphic image of $M(c,h)$, or equivalently the quotient of $M(c,h)$ by some
submodule.

Each Verma module $M(c,h)$ is indecomposable, but may not be irreducible. However,
they all have a unique nontrivial irreducible quotient $V(c,h)$, which is then the unique
irreducible $\mathfrak{Vir}$-module with highest weight $(c,h)$.

The anti-linear anti-homomorphism ('adjoint') of $\mathfrak{Vir}$ sends $L_n$ to $L_{-n}$, and fixes
$C$. The only unitary irreducible $\mathfrak{Vir}$-modules where $L_0$ is diagonalisable and all its
eigenspaces are finite-dimensional are certain $V_{\alpha,\beta}$ in (3.1.4a) (these are $\mathfrak{Vir}$-modules
with $C$ acting trivially), as well as certain highest-weight modules $V(c,h)$ and their
duals, the lowest-weight modules $V(c,h)^*$. In fact, $V(c,h)$ (and $V(c,h)^*$) are unitary
iff either: (i) both $c\ge1$ and $h\ge0$; or (ii) $c$ and $h$ fall into the discrete series, i.e. for
$m,r,s\in\mathbb{N}$ with $1\le s\le r\le m+1$,

$$c=c_m=1-\frac{6}{(m+2)(m+3)},\qquad h=h_{m;r,s}=\frac{((m+3)r-(m+2)s)^2-1}{4(m+2)(m+3)}. \qquad (3.1.6)$$

These $V(c,h)$ are called positive-energy representations since the spectrum of $L_0$ is
positive. Thus the only unitary irreducible $\mathfrak{Vir}$-modules with $L_0$ diagonalisable, with
finite-dimensional $L_0$-eigenspaces, and with the $L_0$-spectrum bounded below, are the
$V(c,h)$ in (i) and (ii). They are the building blocks of the most interesting affine algebra
representations, vertex operator algebra modules and conformal field theories.
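The discrete series is easy to tabulate. The following sketch assumes the parametrisation $c_m=1-\frac{6}{(m+2)(m+3)}$ and $h_{m;r,s}=\frac{((m+3)r-(m+2)s)^2-1}{4(m+2)(m+3)}$ with $1\le s\le r\le m+1$; the function names are ours. Exact rational arithmetic avoids any rounding.

```python
from fractions import Fraction

def central_charge(m):
    return 1 - Fraction(6, (m + 2) * (m + 3))

def conformal_weight(m, r, s):
    numerator = ((m + 3) * r - (m + 2) * s) ** 2 - 1
    return Fraction(numerator, 4 * (m + 2) * (m + 3))

def discrete_series(m):
    """Sorted list of the distinct conformal weights at level m."""
    return sorted({conformal_weight(m, r, s)
                   for r in range(1, m + 2)
                   for s in range(1, r + 1)})

for m in range(3):
    print(m, central_charge(m), [str(h) for h in discrete_series(m)])
# 0 0 ['0']
# 1 1/2 ['0', '1/16', '1/2']
# 2 7/10 ['0', '3/80', '1/10', '7/16', '3/5', '3/2']
```

At $m=0$ only the trivial module $c=h=0$ appears, and the number of weights at level $m$ is $(m+1)(m+2)/2$ for these small cases.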

For unitary $V(c,h)$, we have $V(c,h)=M(c,h)$ when both $c>1$ and $h>0$, or
when $c=1$ and $2\sqrt h\notin\mathbb{Z}$. In these cases, by analogy with (3.1.4b), $V(c,h)$ has
graded-dimension

$$\dim_{V(c,h)}(q):=\mathrm{tr}_{V(c,h)}\,q^{L_0}=q^h\prod_{n=1}^\infty(1-q^n)^{-1}, \qquad (3.1.7a)$$

as the infinite product gives the generating function for the partition numbers:

$$\prod_{n=1}^\infty(1-q^n)^{-1}=\sum_{m=0}^\infty p(m)\,q^m, \qquad (3.1.7b)$$


where $p(m)$ is the number of ways to write $m$ as a sum $m=a_1+a_2+\cdots+a_\ell$ for positive
integers $1\le a_1\le a_2\le\cdots\le a_\ell$. Unlike (3.1.4b), this converges whenever $|q|<1$.
In fact, we recognise (3.1.7b) as (up to a factor of $q^{1/24}$) the reciprocal of the Dedekind
eta $\eta(\tau)$ (2.2.6b), once we change variables by $q=e^{2\pi i\tau}$; we saw last chapter that $\eta(\tau)$
is a modular form for $\mathrm{SL}_2(\mathbb{Z})$. In fact we obtain


$$\dim_{V(c,h)}\!\left(e^{2\pi i(\tau+1)}\right)=e^{2\pi ih}\,\dim_{V(c,h)}\!\left(e^{2\pi i\tau}\right), \qquad (3.1.7c)$$


> [0,6] 
dimye. me = + i exp[27ihh'] dimy(ee7"" dh’.  (3.1.7d) 
T J—oo 


This is our first glimpse of modularity from a graded dimension, though it certainly won't
be our last. But $\eta(\tau)$ arises here through elementary combinatorics, so it is tempting to
dismiss this modularity as accidental. This however would be an error.
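The combinatorics behind the graded dimension can be reproduced in a few lines. This sketch (function names ours) expands the product $\prod_{n\ge1}(1-q^n)^{-1}$ up to a chosen order, and separately counts the Verma-module basis monomials $L_{-i_1}\cdots L_{-i_k}v$ with $i_1\ge\cdots\ge i_k\ge1$ at each level $i_1+\cdots+i_k$. The two lists should agree, since both count partitions.

```python
def partition_numbers(limit):
    """Coefficients p(0), ..., p(limit) of prod_{n>=1} 1/(1 - q^n)."""
    coeffs = [1] + [0] * limit
    for n in range(1, limit + 1):
        # Multiply the truncated series by 1/(1 - q^n) = 1 + q^n + q^(2n) + ...
        for m in range(n, limit + 1):
            coeffs[m] += coeffs[m - n]
    return coeffs

def count_verma_basis(level, largest=None):
    """Number of sequences i_1 >= i_2 >= ... >= i_k >= 1 with sum equal to level,
    where every entry is at most `largest`."""
    if largest is None:
        largest = level
    if level == 0:
        return 1
    return sum(count_verma_basis(level - i, i)
               for i in range(1, min(level, largest) + 1))

print(partition_numbers(10))
print([count_verma_basis(level) for level in range(11)])
```

Multiplying the resulting series by $q^h$ gives the first terms of the graded dimension whenever the Verma module is irreducible.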

What should be the characters of these $\mathfrak{Vir}$-modules? For simple Lie algebras, we
define the character as a trace over formal exponentials of elements of the Cartan
subalgebra. The analogue of the Cartan subalgebra here is $\mathfrak{Vir}_0=\mathbb{C}L_0\oplus\mathbb{C}C$, so the character
of $V(c,h)$ should be

$$\mathrm{ch}_{c,h}(z_L,z_C)=\mathrm{tr}_{V(c,h)}\,e^{2\pi iz_LL_0}\,e^{2\pi iz_CC}, \qquad (3.1.8)$$

which equals $e^{2\pi icz_C}$ times the graded-dimension of $V(c,h)$ (with $q=e^{2\pi iz_L}$).
The characters of the discrete series (3.1.6) are calculated in [477], and again converge
for $|e^{2\pi iz_L}|<1$. Moreover, they obey a much more interesting modularity than do the
graded-dimensions in (3.1.7): let $\begin{pmatrix}a&b\\f&d\end{pmatrix}\in\mathrm{SL}_2(\mathbb{Z})$ act on $\mathfrak{Vir}_0$ by

$$(z_L,z_C)\mapsto\left(\frac{az_L+b}{fz_L+d},\ z_C+\frac{fz_L^2+dz_L-az_L-b}{24\,(fz_L+d)}\right); \qquad (3.1.9)$$

then $\mathrm{ch}_{c_m,h_{m;r,s}}(z_L,z_C)$ is fixed by some $\Gamma(N)$ (recall (2.2.4a)), and for each fixed $m$
(i.e. fixed central charge $c$), the span over all $1\le s\le r\le m+1$ of the characters
$\mathrm{ch}_{c_m,h_{m;r,s}}$ is invariant under $\mathrm{SL}_2(\mathbb{Z})$. They furnish a good example of modular data
(Definition 6.1.6). This $\mathrm{SL}_2(\mathbb{Z})$ action (3.1.9) is a little complicated; if instead we specialise
to the variables $z_L=\tau$ and $z_C=-\tau/24$, then each

$$\mathrm{ch}_{c,h}(\tau)=\mathrm{tr}_{V(c,h)}\,q^{L_0-C/24} \qquad (3.1.10)$$

is a modular function for some $\Gamma(N)$ for $\tau\in\mathbb{H}$, and for fixed $m$ the characters $\mathrm{ch}_{c_m,h_{m;r,s}}(\tau)$
form a vector-valued modular function for $\mathrm{SL}_2(\mathbb{Z})$ (Definition 2.2.2).

The best explanation for the mysterious-looking discrete series (3.1.6) will probably
come from the orbit method [563], but the analysis is still incomplete. At least part of
the discrete series of the Virasoro algebra has been related to (co)homology theory of
the universal cover of $\mathrm{SL}_2(\mathbb{R})$, given a discrete topology [164]. This should be explored
further.

The characters of the non-unitary V(c,h), for c, h ∈ R, have most of the properties
of those of the unitary ones, and it is unfair to completely ignore them. For example,
for c, h ∈ R the modules V(c,h) have a contravariant nondegenerate Hermitian form
⟨·,·⟩; they lack only the positive-definiteness condition. Lie algebras typically have too
many representations and some criterion is needed that isolates the interesting ones, but
unitarity is too restrictive here.

As we know, the Lie algebra Vect(S¹) of vector fields on the circle contains the real Witt
algebra Witt_R (i.e. the span over R of the generators ℓ_n in (1.4.9)) as a dense 'Laurent
polynomial' subalgebra. The connected real Lie group naturally associated with Vect(S¹)
is the group Diff⁺(S¹) of orientation-preserving diffeomorphisms S¹ → S¹ of the circle.
As a group, Diff⁺(S¹) is simple [286] but as a manifold it is not simply connected: its
universal cover $\widetilde{\mathrm{Diff}}^+(S^1)$ is the group of all diffeomorphisms φ : R → R of the real line
satisfying the periodicity condition φ(x + 2π) = φ(x) + 2π. The centre of the universal
cover is Z (namely φ_n(x) = x + 2πn) and $\widetilde{\mathrm{Diff}}^+(S^1)/\mathbb{Z} \cong \mathrm{Diff}^+(S^1)$.

Nontrivial central extensions of Diff⁺(S¹) by a circle are explicitly constructed in, for
example, section 6.8 of [465] and appendix D.5 of [295]; these all have a Lie algebra
isomorphic to the real Virasoro algebra Vir_R (i.e. the R-span of the generators L_m, C of
(3.1.5)).

Lie theory for the Virasoro and Witt algebras (and more generally the Lie algebra
Vect(M) of vector fields on any manifold M) is much more complicated than the finite-
dimensional semi-simple theory described in Chapter 1. For example, although the 'expo-
nential' map exp: Vect(S¹) → Diff⁺(S¹) is defined here (by first integrating the vector
field to its flow), it is neither locally one-to-one nor locally onto (proposition 3.3.1 of
[465]). By comparison, the exponential map of compact Lie groups is locally one-to-one
and globally onto. Moreover, the complex Lie algebra C ⊗ Vect(S¹) does not have a
corresponding Lie group. After all, although a vector field on S¹ corresponds to a path
in the space of maps (in fact diffeomorphisms) S¹ → S¹, and these form a group by
composition, a complex vector field on S¹ corresponds to a path in the space of maps
S¹ → C, and these won't form a group. Segal [502] suggests that the complex Lie
semigroup defined in Section 4.4.1 is the closest we can come to the complexification
of Diff⁺(S¹).

We have two fairly general frameworks in which to understand Lie group representa-
tions: Borel–Weil and the orbit method (a.k.a. geometric quantisation). There is, as we
recall from Section 1.5.5, a general philosophy that says the representations of a group
G (here Diff(S¹)) are in one-to-one correspondence with certain orbits of the coadjoint
action of G on the Lie algebra g of G (here Witt). As mentioned earlier, Witten [563]
explored this possible relation for the Virasoro algebra. For example, the homogeneous
space Diff(S¹)/S¹ appears as an orbit, and can be associated with ghosts in string theory.




The main motivation would be to find a new interpretation for the discrete series (3.1.6), 
which is a little mysterious from the algebraic point of view. Witten identified the orbits 
to which these should correspond, but couldn’t quantise those orbits (this is a common 
curse of the orbit method). 

The space Diff(S¹)/PSL2(R) is also a coadjoint orbit. Something special happens
here when we replace Diff(S¹) with the larger group QS(S¹) of quasi-symmetric
homeomorphisms of S¹: then QS(S¹)/PSL2(R) is called the universal Teichmüller space T.
Every Teichmüller space T_{g,n} (recall Section 2.1.4) is naturally contained in T. Like-
wise, Diff(S¹)/PSL2(R) naturally embeds in T (every diffeomorphism of S¹ is
quasi-symmetric), and intersects each T_{g,n} transversely. See the reviews [460], [168] for
definitions and references. Given this, an intriguing answer to the challenge suggested
by Manin in Section 5.4.1 is to consider the reparametrisations of strings using
quasi-symmetric homeomorphisms rather than diffeomorphisms; see [460] for some physical
speculations.

Pursuing an analogue of Borel–Weil is at least as interesting. Recall that for G compact,
we get an action of G on line bundles on the flag manifold G_C/B, and this accounts
for the special (i.e. finite-dimensional) representations of G. Manin [402] suggested that
something similar happens to Vir, with now the moduli spaces of curves playing the
role of the flag manifold. This thought was made much more precise in [357], [49], [13].
Consider the enhanced moduli space $\widehat{\mathfrak{M}}_{g,n}$ of Section 2.1.4, where each of the n marked
points on the genus-g surface is given a local coordinate z_i. A copy of Witt for each
marked point acts naturally on $\widehat{\mathfrak{M}}_{g,n}$: the vector field $z_i^{\ell}\,\partial/\partial z_i$, for ℓ ≥ 1, changes the
coordinate z_i; ∂/∂z_i moves the ith point; and finally $z_i^{\ell}\,\partial/\partial z_i$ for ℓ ≤ −1 can change the
conformal structure of the surface. This action fills out the tangent space to any point on
$\widehat{\mathfrak{M}}_{g,n}$, i.e. we get a surjective Lie algebra homomorphism from Witt to the tangent space
at any point on $\widehat{\mathfrak{M}}_{g,n}$, and from this we can derive the central extension geometrically
by considering determinant line bundles (a nice introduction to this important object is
[192]) over $\widehat{\mathfrak{M}}_{g,n}$.

Pushing this much further would force us into the complexities (and riches) of alge-
braic geometry and D-modules (see [116] for a gentle introduction to the simplest D-
modules). A far-reaching generalisation of the Borel–Weil Theorem is the equivalence
of categories established by Beilinson–Bernstein and Brylinski–Kashiwara: given a Lie
group G with semi-simple Lie algebra g, their 'localisation functor' relates an algebraic
category, whose objects include the Verma modules of g, with a topological category of
D-modules (i.e. sheaves of modules over a ring of differential operators over the flag
manifold G_C/B). Describing this would take us far afield (see [80], [417] for reviews
and references). In conformal field theory, the Virasoro algebra, moduli spaces $\widehat{\mathfrak{M}}_{g,n}$,
and mapping class groups Γ_{g,n} take the place of g, G_C/B and the Weyl group [402],
[530]. [49] relates Virasoro modules to D-modules on the enhanced moduli space $\widehat{\mathfrak{M}}_{g,n}$.

In any case, this deep relation between moduli spaces of curves and Vir is significant
to Moonshine, because of its relation to the analogues of the Knizhnik-Zamolodchikov
(KZ) equations in any conformal field theory at any genus. We elaborate on this elsewhere
(starting in Section 3.2.4), but for now let us say that 'chiral blocks' are sections over
the moduli spaces $\widehat{\mathfrak{M}}_{g,n}$, and satisfy a system of partial differential equations saying
roughly that they respect this Vir action. The monodromy of those equations gives
rise to projective actions of the mapping class groups on the spaces of chiral blocks.
Now, the chiral blocks of the space $\mathfrak{M}_{1,1}$ (or rather $\widehat{\mathfrak{M}}_{1,1}$) are vertex operator algebra
characters (including for instance (3.1.10)), and Γ_{1,1} = SL2(Z) (or rather its central
extension $\widehat{\Gamma}_{1,1} = B_3$) acts on them. This is conformal field theory's explanation for the
modularity of these characters. Thus the Virasoro algebra, through its action on the $\widehat{\mathfrak{M}}_{g,n}$,
lies at the heart of Moonshine.


Question 3.1.1. (a) Let G = Z_2 × Z_2. Define a map P : G → GL_2(C) by


1 1 
P00.) = (4 i) raoi a 


POD =(4 A aiei a) 


Verify that P is a projective representation of G.
(b) Let Q be the order 8 'quaternion group', given by the following relations:

\[ Q = \{\pm 1, \pm i, \pm j, \pm k \mid -1 = (\pm i)^2 = (\pm j)^2 = (\pm k)^2,\ ij = k = -ji,\ -1 \text{ is in centre}\}. \]

Show that there is a homomorphism φ : Q → G with kernel {±1}.
(c) Show that there is a true representation R of Q such that

\[ P(x) = \delta(x)\, R(r(x)), \qquad \forall x \in G, \]

where r(x) ∈ φ^{-1}(x), and where δ : G → C^×.


Question 3.1.2. Identify G = S¹ with R/Z, and for any class [x] ∈ R/Z, choose the
unique representative 0 ≤ x < 1. Verify that for any complex number α, the map [x] ↦
α^x defines a one-dimensional projective representation of S¹. Find the corresponding
true representation on the universal cover $\widetilde{G}$ of S¹.


Question 3.1.3. For this question, let G be the additive group C². Define the function
α : G × G → C^× by α(z, w) = exp[z_2 w_1 − z_1 w_2]. Verify that α obeys the 2-cocycle
condition (3.1.1b), and construct the corresponding central extension.


Question 3.1.4. Find all one-dimensional central extensions of the Lie algebra A_1.


Question 3.1.5. Show that there are only two one-dimensional central extensions of the
Witt algebra, up to isomorphism. (Hint: first show, changing basis if necessary, that
[L_0, L_n] = −n L_n. Then consider anti-associativity of [L_0[L_m L_n]].)


Question 3.1.6. (a) The group PSL2(R) acts naturally on the unit disc |z| < 1 by Möbius
transformations. Use this to embed PSL2(R) naturally in Diff⁺(S¹), and find the corre-
sponding Lie subalgebra of Vect(S¹).

(b) The group SL2(R) naturally acts on the space of semi-infinite rays R_>(x, y) in R² with
endpoint at the origin (0, 0). Find this action, and use it to embed SL2(R) in Diff⁺(S¹).
Find the corresponding Lie subalgebra of Vect(S¹).


Question 3.1.7. Prove that the Lie algebra of derivations of the algebra C[x^{±1}] of Laurent
polynomials is Vect(S¹).


Question 3.1.8. Find the constant α ∈ C for which the new basis L'_n = L_n + α δ_{n,0} C of
Vir has especially simple brackets [L'_m, L'_n].


3.2 Affine algebras and their representations 


The theory of nontwisted affine Kac—Moody algebras (usually called affine algebras) is 
very analogous to that of the finite-dimensional simple Lie algebras. Nothing infinite- 
dimensional tries harder to be finite-dimensional than affine algebras. Their construction 
is so trivial that it seems surprising anything interesting and new can happen here. But 
a certain ‘miracle’ happens. . . 

Standard references for the theory of affine algebras are [328], [337], [214], [551]. 
We will ignore here an interesting part of the story: the KP hierarchy [423]. 


3.2.1 Motivation 


Generalisations are too easy; they should be justified before they are endured. Here we
describe the original justifications for the study of Kac–Moody algebras.

Each simple finite-dimensional Lie algebra has, as we know, a Weyl group, which is 
a symmetry of most of the data of the algebra (e.g. the weight multiplicities of finite- 
dimensional modules) and which encodes much (but not all) of the structure of the 
algebra. These Weyl groups are a very special sort of group: they are generated by 
reflections (namely those through the simple roots). 

Associated with any vector α ∈ R^n, the reflection r_α through α, sending α to −α and
fixing the hyperplane perpendicular to α, is given by (1.5.5c). More abstractly, a reflection
r is simply an involution (i.e. order 2: r² = e). A finite reflection group is a finite group
generated by reflections. Coxeter studied these as symmetries of regular solids.

For example, the dihedral group D_n (the group of symmetries of a regular n-gon) is
a finite reflection group, consisting of n reflections and n rotations, and is generated by
any two neighbouring reflections. The symmetric group S_n is a finite reflection group: it
acts on an orthonormal basis e_i of R^n by permuting the subscripts, and is generated by
the transpositions (i, i + 1), which are the reflections r_α through the vectors α = e_i − e_{i+1}.
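
To see concretely that a transposition is a reflection, here is a small Python sketch. It assumes the usual formula r_α(v) = v − 2 (v·α)/(α·α) α for the reflection through α, which we take to be the content of (1.5.5c); the helper names `reflect` and `simple_root` are ours, and indices are 0-based.

```python
from fractions import Fraction


def reflect(v, alpha):
    """Reflection of v through alpha: v - 2 (v.alpha)/(alpha.alpha) alpha."""
    dot = lambda x, y: sum(a * b for a, b in zip(x, y))
    coeff = Fraction(2 * dot(v, alpha), dot(alpha, alpha))
    return [a - coeff * b for a, b in zip(v, alpha)]


def simple_root(i, n):
    """The vector e_i - e_{i+1} in R^n (0-based index i)."""
    alpha = [0] * n
    alpha[i], alpha[i + 1] = 1, -1
    return alpha


v = [5, 7, 11, 13]
print(reflect(v, simple_root(1, 4)) == [5, 11, 7, 13])   # -> True
```

The reflection through e_i − e_{i+1} swaps the ith and (i+1)th coordinates and fixes the rest, which is exactly the action of the transposition.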

Finite reflection groups have remarkably simple presentations. 


Definition 3.2.1 A Coxeter group G is a group with a set R of generators, whose
complete list of relations is

\[ (r r')^{m(r,r')} = e, \qquad \forall r, r' \in R, \]

where m(r, r) = 1 and the other m(r, r') all lie in {2, 3, ..., ∞}. (The value m(r, r') = ∞
means that rr' has infinite order.)


The geometry of Coxeter groups is quite pretty; see, for example, [301], [84]. In Sec-
tion 7.1.1 we describe a generalisation due to Conway, and its relation to the Monster M.

Fig. 3.1 The indecomposable finite Coxeter groups.


The lists of finite Coxeter groups and finite reflection groups coincide. They are most
easily described by the associated Coxeter graph: put a node for each generator r ∈ R,
and connect two nodes with an edge labelled m(r, r'). To increase readability, erase the
edge and label if m(r, r') = 2, and erase the label (but keep the edge) if m(r, r') = 3. The
complete list of finite Coxeter groups (Coxeter, 1935) is given by arbitrary disjoint unions
of the graphs of Figure 3.1. The group given by A_n is the symmetric group S_{n+1}, and
I_2(n) is the dihedral group D_n. The group H_3 is the symmetry group of the icosahedron,
and is isomorphic to Z_2 × A_5.
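
As an illustration of how a Coxeter graph is read off from a group, the following Python sketch computes the numbers m(r, r') for the adjacent transpositions generating S_{n+1}. This is only a check of the orders of the products rr', not a proof that these relations are complete; the function names are ours.

```python
def compose(p, q):
    """Composition of permutations given as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def order(p):
    """Order of the permutation p."""
    identity = tuple(range(len(p)))
    power, k = p, 1
    while power != identity:
        power, k = compose(power, p), k + 1
    return k


def adjacent_transposition(i, size):
    """The transposition swapping positions i and i+1 (0-based)."""
    perm = list(range(size))
    perm[i], perm[i + 1] = perm[i + 1], perm[i]
    return tuple(perm)


def coxeter_matrix_of_symmetric_group(n):
    """Matrix of orders m(r, r') for the n adjacent transpositions of S_{n+1}."""
    gens = [adjacent_transposition(i, n + 1) for i in range(n)]
    return [[order(compose(r, s)) for s in gens] for r in gens]


print(coxeter_matrix_of_symmetric_group(3))
# -> [[1, 3, 2], [3, 1, 3], [2, 3, 1]]
```

Neighbouring generators have m = 3 (an unlabelled edge) and all other pairs have m = 2 (no edge), which is the chain graph A_n.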

Figure 3.1 should remind us of Figure 1.17. Indeed, Figure 3.1 includes the Weyl
groups of all simple finite-dimensional Lie algebras. More precisely, the Weyl groups
consist of all finite Coxeter groups that obey the crystallographic condition: for all
distinct r, r' ∈ R, m(r, r') ∈ {1, 2, 3, 4, 6}. Geometrically, the crystallographic condition
says that the Coxeter group stabilises a lattice in R^n (see also Question 1.7.6). As we
recall, the Weyl groups stabilise the corresponding root lattice.

Most Coxeter groups are infinite. As a graduate student, Robert Moody asked: since
the finite-dimensional semi-simple Lie algebras correspond to finite crystallographic
Coxeter groups, what is the class of Lie algebras that corresponds more generally
to any Coxeter group? Presumably they should have a theory very similar to that of
the semi-simple ones. The partial answer to Moody's beautiful question is that the Lie
algebras corresponding to the (possibly infinite) crystallographic Coxeter groups are
the Kac–Moody algebras! In fact, much of the interest in the affine algebras is due ulti-
mately to their Weyl groups. We still don't know the Lie algebras corresponding to the
noncrystallographic groups.

Victor Kac's road to these algebras was quite different. Let g be a complex Lie algebra.
By a Z-grading we mean that we can write the vector space g as $\mathfrak{g} = \bigoplus_{n\in\mathbb{Z}} \mathfrak{g}_n$, such
that [g_m, g_n] ⊆ g_{m+n} for all m, n ∈ Z. We call g a simple Z-graded Lie algebra if, in
addition, g does not contain any nontrivial Z-graded ideal.

It is probably hopeless to classify all simple Z-graded Lie algebras: there are too many
of them. However, decades earlier, Cartan had studied vector fields (i.e. derivations) on
polynomial algebras, and found four infinite families that were simple Z-graded, with
the dimension dim(g_n) bounded above by some polynomial in n. We say that these Z-
graded algebras have polynomial growth. Kac conjectured, and Olivier Mathieu proved,
the complete list of such algebras.


Theorem 3.2.2 [409] The simple Z-graded Lie algebras of polynomial growth are: 
(a) the finite-dimensional simple Lie algebras; 

(b) the loop algebras (possibly twisted); 

(c) Cartan’s four families; and 

(d) the Witt algebra Witt. 


The proof is long and complicated. We’ve already met the finite-dimensional g and 
the Witt algebra. Cartan’s algebras are defined explicitly in, for example, [409]. The 
‘loop algebras’ are constructed next subsection (there are six infinite families and seven 
exceptionals). 

What we call the affine algebras — our main interest this chapter — are the central 
extensions of these loop algebras. Of course, such algebras cannot be simple because 
of their centres, and for this reason aren’t in Mathieu’s list. In any case, the affine 
algebras (together with the Virasoro algebra) answer a technical but natural algebraic 
question. 

A couple of years after their mathematical introduction [325, 430], the nontwisted 
affine algebras were discovered independently in string theory [42], under the name 
current algebras. 

The Lie algebras (a)–(d) in Mathieu's list are truly extraordinary, especially regarding
their representation theory. The simplest of Cartan's families are the Weyl algebras, which
are the differential operators on the algebra C[x_1, ..., x_n] of polynomials, generated
by multiplication operators x_1, ..., x_n and partial derivatives ∂/∂x_1, ..., ∂/∂x_n. Their
modules are the simplest D-modules and have deep connections throughout mathematics
and physics (see [116], [80] for an introduction).


3.2.2 Construction and structure 


Let $\bar{\mathfrak{g}}$ be any simple finite-dimensional Lie algebra. The affine algebra $\mathfrak{g} = \bar{\mathfrak{g}}^{(1)}$ is
essentially the (polynomial) loop algebra $\mathcal{L}_{poly}\bar{\mathfrak{g}} = \mathbb{C}[t^{\pm 1}] \otimes \bar{\mathfrak{g}}$, defined to be all pos-
sible 'Laurent polynomials' $\sum_{n\in\mathbb{Z}} a_n t^n$ where each $a_n \in \bar{\mathfrak{g}}$ and all but finitely many
a_n = 0. Treat t here as a formal variable. The bracket in $\mathcal{L}_{poly}\bar{\mathfrak{g}}$ is the obvious one: e.g.
$[a t^n, b t^m] = [ab]\, t^{n+m}$. Geometrically, $\mathcal{L}_{poly}\bar{\mathfrak{g}}$ is the Lie algebra of polynomial maps
$S^1 \to \bar{\mathfrak{g}}$ (to see this realisation, think of $t = e^{2\pi i\theta}$). This explains the name, and also sug-
gests several generalisations (e.g. take any manifold in place of S¹). But the loop algebra
is the simplest and best understood of these geometric Lie algebras, and the only one we
consider in any depth (but see Section 3.3). Note that $\mathcal{L}_{poly}\bar{\mathfrak{g}}$ is infinite-dimensional.
Its Lie groups are the loop groups, consisting of all maps of S¹ to a Lie group for $\bar{\mathfrak{g}}$
(Section 3.2.6).

We saw S¹ before, in the discussion of the Witt algebra, so we may expect the Virasoro
and affine algebras to be related. In fact, the Witt algebra acts on the affine algebras
as derivations. By definition, a derivation D is a linear map that obeys the product
rule for derivatives: D([xy]) = [(Dx)y] + [x(Dy)]. The easiest examples are the 'inner
derivations': D = ad(x). All derivations of $\bar{\mathfrak{g}}$ are inner, but the loop algebra $\mathcal{L}_{poly}\bar{\mathfrak{g}}$ has
several non-inner ones. In particular, because $\mathcal{L}_{poly}\bar{\mathfrak{g}}$ consists of all (polynomial) maps
$S^1 \to \bar{\mathfrak{g}}$, the vector fields Vect_poly(S¹), and hence the Witt algebra Witt, act on it. More
precisely, using the realisation $\ell_j = -t^{j+1}\, d/dt$ of the basis vectors of (1.4.9), we get
the action

\[ \ell_j(a t^n) = -t^{j+1} \frac{d}{dt}(a t^n) = -n\, a\, t^{n+j}. \tag{3.2.1} \]

This relation between Witt and $\mathcal{L}_{poly}\bar{\mathfrak{g}}$ plays an important role in the whole theory.


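The loop algebra has a unique nontrivial one-dimensional central extension $\widehat{\mathcal{L}}_{poly}\bar{\mathfrak{g}} = \mathcal{L}_{poly}\bar{\mathfrak{g}} \oplus \mathbb{C}C$, defined by

\[ [t^m x, t^n y] = t^{m+n} [x, y] + m\, \delta_{m,-n}\, \kappa(x|y)\, C \tag{3.2.2a} \]

for all $x, y \in \bar{\mathfrak{g}}$ and m, n ∈ Z, where κ(x|y) is the invariant bilinear form (Killing form)
of $\bar{\mathfrak{g}}$. Thus $\widehat{\mathcal{L}}_{poly}\bar{\mathfrak{g}}$ has the same relation to $\mathcal{L}_{poly}\bar{\mathfrak{g}}$ that Vir has to Witt. Incidentally,
[344] relates the central extensions (3.2.2a) and (3.1.5) to logarithms of differential
operators.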
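
That the central term in (3.2.2a) is compatible with the Jacobi identity can be checked by brute force in the smallest case. The Python sketch below is ours and makes two assumptions not fixed by the text: it takes the finite algebra to be sl_2(C) realised by 2×2 matrices, and it uses the trace form tr(xy) as the invariant form (it is proportional to the Killing form). A homogeneous element x t^m + wC is stored as the triple (x, m, w).

```python
from itertools import product

# Standard basis of sl_2 as 2x2 integer matrices.
E = ((0, 1), (0, 0))
F = ((0, 0), (1, 0))
H = ((1, 0), (0, -1))
ZERO = ((0, 0), (0, 0))


def mat_mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))


def commutator(a, b):
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return tuple(tuple(ab[i][j] - ba[i][j] for j in range(2)) for i in range(2))


def trace_form(a, b):
    """Assumed invariant form: tr(ab), proportional to the Killing form."""
    ab = mat_mul(a, b)
    return ab[0][0] + ab[1][1]


def bracket(X, Y):
    """Bracket of X = (x, m, w) and Y = (y, n, w'), following (3.2.2a).

    The central components w, w' do not contribute, since C is central.
    """
    (x, m, _), (y, n, _) = X, Y
    central = m * trace_form(x, y) if m + n == 0 else 0
    return (commutator(x, y), m + n, central)


def jacobi_defect(X, Y, Z):
    """[X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] as (matrix part, central part)."""
    terms = [bracket(X, bracket(Y, Z)),
             bracket(Y, bracket(Z, X)),
             bracket(Z, bracket(X, Y))]
    matrix = tuple(tuple(sum(t[0][i][j] for t in terms) for j in range(2))
                   for i in range(2))
    return matrix, sum(t[2] for t in terms)


degrees = range(-2, 3)
jacobi_holds = all(
    jacobi_defect((x, m, 0), (y, n, 0), (z, p, 0)) == (ZERO, 0)
    for x, y, z in product((E, F, H), repeat=3)
    for m, n, p in product(degrees, repeat=3)
)
print(jacobi_holds)                       # -> True
print(bracket((E, 1, 0), (F, -1, 0)))     # -> (((1, 0), (0, -1)), 0, 1)
```

The last line shows the extension at work: [tE, t^{-1}F] equals H plus one unit of C, whereas in the loop algebra itself it would be H alone.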

In addition, for a technical reason (namely, to make the simple roots linearly indepen-
dent, so weight spaces can be finite-dimensional), a further noncentral one-dimensional
extension is usually made. The result: by the affine algebra $\mathfrak{g} = \bar{\mathfrak{g}}^{(1)}$ we mean the exten-
sion of $\widehat{\mathcal{L}}_{poly}\bar{\mathfrak{g}}$ by the derivation $\ell_0 := -t\, \frac{d}{dt}$. The Witt algebra also acts naturally on g
(Question 3.2.3). The superscript '(1)' denotes the fact that the loop algebra was twisted
by an order-1 automorphism, in other words that it is nontwisted. It is called 'affine'
because of its Weyl group, as we shall see.

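For example, elements in $A_1^{(1)}$ are triples (a(t), w, x), standing for a(t) + wC + xℓ_0, where w, x ∈ C and
$a(t) = \sum_{n\in\mathbb{Z}} a_n t^n$, for all a_n ∈ sl_2(C) and only finitely many a_n ≠ 0. The Lie bracket
is

\[ [(a(t), w, x), (a'(t), w', x')] = \Bigl( \sum_{m,n} [a_m, a'_n]\, t^{m+n} + x' \sum_{n} n\, a_n t^n - x \sum_{m} m\, a'_m t^m,\ \ \sum_{m} m\, \kappa(a_m | a'_{-m}),\ \ 0 \Bigr). \tag{3.2.2b} \]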

Each object associated with $\bar{\mathfrak{g}}$ has an analogue here: Coxeter-Dynkin diagram, Weyl
group, weights, ... For instance, the affine Coxeter-Dynkin diagram (Figure 3.2) is
obtained from that of $\bar{\mathfrak{g}}$ (Figure 1.17) by adding one node, labelled with an 'x'. We have
included the labels a_i and (where different from a_i) colabels $a_i^\vee$, whose significance is
given next subsection.

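Fig. 3.2 The nontwisted affine Coxeter-Dynkin diagrams.

The Cartan subalgebra plays the same role here that it does in Chapter 1: decomposing
modules into weight spaces. It can be chosen to be $\mathfrak{h} = \bar{\mathfrak{h}} \oplus \mathbb{C}C \oplus \mathbb{C}\ell_0$, where $\bar{\mathfrak{h}}$ is a Cartan
subalgebra of the semi-simple algebra $\bar{\mathfrak{g}}$. In fact, g has a triangular decomposition
$\mathfrak{g} = \mathfrak{g}_+ \oplus \mathfrak{h} \oplus \mathfrak{g}_-$ (recall (1.5.5d)) where

\[ \mathfrak{g}_\pm = \bigl( t^{\pm 1}\, \mathbb{C}[t^{\pm 1}] \otimes (\bar{\mathfrak{g}}_\mp \oplus \bar{\mathfrak{h}}) \bigr) \oplus \mathbb{C}[t^{\pm 1}] \otimes \bar{\mathfrak{g}}_\pm \tag{3.2.3a} \]

(read with consistent signs, so that $\mathbb{C}[t^{\pm 1}]$ here denotes the polynomials in t, respectively in t^{-1})
and $\bar{\mathfrak{g}} = \bar{\mathfrak{g}}_+ \oplus \bar{\mathfrak{h}} \oplus \bar{\mathfrak{g}}_-$ is a triangular decomposition of $\bar{\mathfrak{g}}$. Given h, we obtain the root-
space decomposition of g, as in (1.5.5a):

\[ \mathfrak{g} = \mathfrak{h} \oplus \bigoplus_{n\in\mathbb{Z}} \bigoplus_{\alpha} t^n\, \bar{\mathfrak{g}}_\alpha \oplus \bigoplus_{n\in\mathbb{Z}\setminus 0} t^n\, \bar{\mathfrak{h}}, \tag{3.2.3b} \]

where $\bar{\mathfrak{g}} = \bar{\mathfrak{h}} \oplus \bigoplus_\alpha \bar{\mathfrak{g}}_\alpha$ is the root-space decomposition of $\bar{\mathfrak{g}}$. We return to (3.2.3) when we study g-modules next subsection,
but for now note that if $\bar{\mathfrak{g}}$ has rank r, then the root spaces $t^n \bar{\mathfrak{h}}$ of g have dimension r
while all $t^n \bar{\mathfrak{g}}_\alpha$ have dimension 1. The latter, which act like root spaces in $\bar{\mathfrak{g}}$, are called
real, while the former are called imaginary.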

This loop algebra construction can be twisted. Let $\bar{\mathfrak{g}}$ again be any simple and finite-
dimensional Lie algebra and let g be the corresponding affine algebra. Choose any
symmetry α of the Coxeter-Dynkin diagram of $\bar{\mathfrak{g}}$, of order N say, and extend this into
an automorphism of $\bar{\mathfrak{g}}$ as in Section 1.5.4. We can further extend α to an automorphism
of g, by requiring α to fix C and ℓ_0, and send $a t^n$ to $\alpha(a)\, \xi_N^{n}\, t^n$. Then the fixed-point
subalgebra g_0 of g is

\[ \mathfrak{g}_0 = \Bigl\{ \sum_{n\in\mathbb{Z}} a_n t^n + wC + x\ell_0 \Bigm| a_n \in \bar{\mathfrak{g}}_{n \bmod N} \Bigr\}, \tag{3.2.4} \]

where $\bar{\mathfrak{g}}_j$ are the eigenspaces of α in $\bar{\mathfrak{g}}$ (recall (1.5.12)). This Lie algebra g_0 is called
a twisted affine algebra and is denoted $\bar{\mathfrak{g}}^{(N)}$. All twisted affine algebras are listed in
Figure 3.3, with their colabels. Twisted affine algebras behave very analogously to the
nontwisted ones, and also have a significant role in the theory (Section 3.4.1).


Fig. 3.3 The twisted affine Coxeter-Dynkin diagrams.


3.2.3 Representations 


The loop algebra $\mathcal{L}_{poly}\bar{\mathfrak{g}}$ has no interesting modules, which is why we centrally extend it
and introduce the affine algebras $\mathfrak{g} = \bar{\mathfrak{g}}^{(1)}$. No interesting g-module is finite-dimensional.
However, g has a triangular decomposition (3.2.3a), so highest-weight modules exist.
Weights λ ∈ h^* here are triples $(\bar\lambda, k, u) \in \bar{\mathfrak{h}}^* \times \mathbb{C}^2$; a weight vector v obeys

\[ h.v = \bar\lambda(h)\, v, \qquad C.v = k v, \qquad \ell_0.v = u v. \]

Define the Verma module $M(\bar\lambda, k, u)$ and the irreducible highest-weight module
$L(\bar\lambda, k, u)$ (our greatest interest) as in Section 1.5.3. Given any highest-weight module
M, the central term C acts as a multiple kI of the identity; this constant k is a funda-
mental invariant of the representation called the level of M. On the other hand, the value
of u is irrelevant (at least when the level is not 0); see Question 3.2.4.

A highest-weight module M is infinite-dimensional but comes with a grading
$M = \bigoplus_{n=0}^{\infty} M_{u+n}$ into eigenspaces of ℓ_0. Because ℓ_0 commutes with $\bar{\mathfrak{g}}$, these spaces M_{u+n} are
all $\bar{\mathfrak{g}}$-modules, and the lowest, namely M_u, has highest weight $\bar\lambda$. Using this we can define
the graded-dimension as in (3.1.4b). However, the ℓ_0-spaces of Verma modules will be
infinite-dimensional, as will those of $L(\bar\lambda, k, u)$ unless $\bar\lambda \in P_+(\bar{\mathfrak{g}})$. There are two ways
to proceed: either find a more suitable grading, or (more important) consider instead the
character.

Defining these characters requires decomposing our modules into weight-spaces, and
for this we should fix a basis for h^*. A basis for h is h_1, ..., h_r (the usual basis for
$\bar{\mathfrak{h}}$) together with $h_0 := C - \sum_{i=1}^{r} a_i^\vee h_i$ and −ℓ_0 ($a_i^\vee$ are the colabels of Figure 3.2).
The reason for introducing h_0 will be clearer in Section 3.3.1. The dual basis for h^*,
corresponding to h_0, ..., h_r, −ℓ_0, is written ω_0, ..., ω_r, δ. Recall from Sections 1.4.3
and 1.5.2 the Killing form κ(h|h') and (λ|μ) for $\bar{\mathfrak{g}}$; its analogue for affine algebras
(Question 3.2.5) obeys

\[ \kappa(z + a\ell_0 + uC \mid z' + a'\ell_0 + u'C) = \kappa(z|z') - a u' - u a', \tag{3.2.5a} \]

\[ \Bigl( \sum_{i=0}^{r} \lambda_i \omega_i + b\delta \Bigm| \sum_{j=0}^{r} \mu_j \omega_j + b'\delta \Bigr) = \Bigl( \sum_{i=1}^{r} \lambda_i \omega_i \Bigm| \sum_{j=1}^{r} \mu_j \omega_j \Bigr) + \sum_{i=0}^{r} a_i^\vee\, (b'\lambda_i + b\mu_i). \tag{3.2.5b} \]




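The level k is recovered from the weight λ by the formula

\[ k = (\lambda|\delta) = \sum_{i=0}^{r} a_i^\vee \lambda_i. \tag{3.2.5c} \]

A useful formula gives the evaluation λ(h):

\[ \Bigl( \sum_{i=0}^{r} \lambda_i \omega_i + b\delta \Bigr)(z + \tau\ell_0 + uC) = \Bigl( \sum_{i=1}^{r} \lambda_i \omega_i \Bigr)(z) + k u - \tau b. \tag{3.2.5d} \]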

In this notation, the roots of g are $\bar\alpha - (\bar\theta|\bar\alpha)\omega_0 + n\delta$ for any root $\bar\alpha$ of $\bar{\mathfrak{g}}$ (these are
the real roots, and have multiplicity 1), as well as nδ for n ≠ 0 (the imaginary roots, with multi-
plicity equal to the rank r of $\bar{\mathfrak{g}}$). The root $\bar\theta = \sum_{i=1}^{r} a_i \bar\alpha_i$ is called the highest root of $\bar{\mathfrak{g}}$,
where a_i are the labels of g (Figure 3.2). The positive roots are any of these with n > 0,
together with $\bar\alpha - (\bar\theta|\bar\alpha)\omega_0$ for positive roots $\bar\alpha$. The simple roots are $\alpha_i := \bar\alpha_i - (\bar\theta|\bar\alpha_i)\omega_0$
for 1 ≤ i ≤ r, together with $\alpha_0 := \delta - \sum_{i=1}^{r} a_i \alpha_i$. Note that the adjoint representation
of an affine algebra is not a highest-weight representation (why?). Many of these com-
ments will make more sense when we associate a Coxeter-Dynkin diagram to g in
Section 3.3.1.

The weight-spaces for the Verma modules, and hence any highest-weight module M,
are always finite-dimensional and so we can define their character ch_M as in (1.5.9a).
For an easy example, the Verma module $M(\bar\lambda, k, 0) = M(\lambda)$ has character

\[ \mathrm{ch}_{M(\lambda)}(h) = e^{\lambda(h)} \prod_{\alpha > 0} \bigl( 1 - e^{-\alpha(h)} \bigr)^{-\mathrm{mult}\,\alpha}, \tag{3.2.6} \]

where 'mult α' denotes the dimension of the root-space g_α (which now may be > 1). We
can obtain convergent graded dimensions by specialising this in any number of ways;
the most obvious (called the principal gradation) chooses h ∈ h so that $e^{\alpha_i(h)} = x$ for all
simple roots α_i (0 ≤ i ≤ r), and $e^{\lambda(h)} = 1$ (x is a formal variable). In other words, the
principal grading of a vector with weight $\lambda - \sum_{i=0}^{r} n_i \alpha_i$ is $\sum_i n_i$ less than the grading
of λ; this gradation keeps track of how many 'creation operators' f_i (using notation
introduced in Section 3.3.1) are applied to the 'vacuum' v in order to create the given state.

For example, the affine algebra $A_1^{(1)}$ has positive roots 2ω_1 − 2ω_0 + nδ (for n ≥ 0)
as well as mδ and −2ω_1 + 2ω_0 + mδ (for m > 0). All root multiplicities are 1. The
simple roots are α_1 = 2ω_1 − 2ω_0 and α_0 = δ − α_1. A highest weight λ looks like λ_0ω_0 +
λ_1ω_1, with level λ_0 + λ_1. Applying the principal gradation to the $A_1^{(1)}$-Verma module,
its character (3.2.6) specialises to the principally-graded dimension

\[ \dim^{princ}_{M(\lambda)}(x) = \prod_{n=0}^{\infty} \bigl(1 - x^{-(1+2n)}\bigr)^{-1} \prod_{m=1}^{\infty} \bigl(1 - x^{-2m}\bigr)^{-1} \prod_{m=1}^{\infty} \bigl(1 - x^{-(-1+2m)}\bigr)^{-1} = \frac{\eta(2\tau)}{\eta(\tau)^2}, \tag{3.2.7} \]

where we write $x = e^{-2\pi i\tau}$ and recall the Dedekind eta function from (2.2.6b). Thus
once again we find the remarkable fact that graded dimensions of Verma modules have
something to do with the modular group SL2(Z) (compare (3.1.7)). Something similar
happens for the highest-weight representations of any affine algebra!

Nothing particularly deep is happening here. The modularity of $\dim^{princ}$ arises here
for free, simply from the combinatorics. Indeed, for any affine algebra, the specialised
product of (3.2.6) is the generating function for some partition-like function as in (3.1.7b),
and these have nice modular behaviour (by arguments like those used in Section 2.2.2).

More precisely, in the Verma module we get a free action of the creation operators of
a Heisenberg subalgebra, coming from the central extension of the loop algebra of the
Cartan subalgebra $\bar{\mathfrak{h}}$. Thus the modular group arises in affine algebra characters because
of a Heisenberg algebra action. However, much as the discrete series (3.1.6) of Vir-
modules behaves more simply than the other unitary Vir-modules, discretising the integral
in (3.1.7d), certain families of g-modules have especially nice modular properties. What
makes this work is the Weyl group. It is this conjunction of the Heisenberg subalgebra
with the affine Weyl group that makes affine algebras so special.
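
The product in (3.2.7) can be compared with the eta quotient coefficient by coefficient. In the Python sketch below (helper names are ours) we write q = x^{-1} = e^{2πiτ}, so that each positive root of principal height d contributes a factor (1 − q^d)^{-1}. For $A_1^{(1)}$ the odd heights occur twice (from α_1 + nδ and −α_1 + mδ) and the even heights once (from mδ). The right-hand side uses η(τ) = q^{1/24} ∏(1 − q^n), in which case the fractional powers of q cancel in η(2τ)/η(τ)².

```python
ORDER = 40  # compare coefficients of q^0, ..., q^ORDER


def times_inverse_factor(series, e):
    """Multiply a truncated power series in place by 1/(1 - q^e)."""
    for m in range(e, len(series)):
        series[m] += series[m - e]


def times_factor(series, e):
    """Multiply a truncated power series in place by (1 - q^e)."""
    for m in range(len(series) - 1, e - 1, -1):
        series[m] -= series[m - e]


def verma_side(order):
    """Product over positive roots of A_1^(1), graded by principal height."""
    series = [1] + [0] * order
    for height in range(1, order + 1):
        multiplicity = 2 if height % 2 == 1 else 1
        for _ in range(multiplicity):
            times_inverse_factor(series, height)
    return series


def eta_side(order):
    """Expansion of prod (1 - q^(2k)) / prod (1 - q^k)^2 = eta(2 tau)/eta(tau)^2."""
    series = [1] + [0] * order
    for k in range(1, order + 1):
        times_inverse_factor(series, k)
        times_inverse_factor(series, k)
        if 2 * k <= order:
            times_factor(series, 2 * k)
    return series


print(verma_side(ORDER) == eta_side(ORDER))   # -> True
```

Factors of height greater than the truncation order cannot affect the retained coefficients, so the comparison is exact up to q^ORDER.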

The analogue for g of the finite-dimensional modules of $\bar{\mathfrak{g}}$ are called the integrable
highest-weight modules. Technically speaking, an integrable representation π is one
where all $x \in \mathfrak{g}_\pm$ are locally nilpotent, that is, for each v ∈ V there is a number n_x(v) such
that $\pi(x)^{n_x(v)} v = 0$. In particular, this means $e^{\pi(x)}$ is well-defined as an operator on the
module by its Taylor series; in infinite dimensions most operators can't be exponentiated.
These modules are called integrable because they are precisely those highest-weight
modules that can be 'integrated' to a projective module of the corresponding loop group
(Section 3.2.6). The integrable modules are precisely the unitary ones.

The highest weight $\lambda = \sum_{i=0}^{r} \lambda_i \omega_i$ is integrable iff each λ_i ∈ N. Hence the set of all
integrable level k highest weights is

\[ P_+^k(\mathfrak{g}) = \Bigl\{ \sum_{i=0}^{r} \lambda_i \omega_i \Bigm| \lambda_i \in \mathbb{N},\ k = \sum_{i=0}^{r} a_i^\vee \lambda_i \Bigr\}. \tag{3.2.8} \]

Simple formulae for the cardinality $\|P_+^k(\mathfrak{g})\|$ exist for all algebras (Question 3.2.6);
for example, for $A_r^{(1)}$ it is $\|P_+^k\| = \binom{k+r}{r}$. The most important weight in $P_+^k(\mathfrak{g})$ is kω_0,
often denoted '0' in the literature. The module L(kω_0) has a vertex operator algebra
structure (Section 5.2.2) and corresponds to the vacuum sector in conformal field theory
(Section 6.1.1).
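
The set (3.2.8) is finite and easy to enumerate. The following Python sketch (our own helper `integrable_weights`, with N taken to include 0) lists the Dynkin labels (λ_0, ..., λ_r) for a given list of colabels, and compares the count for $A_r^{(1)}$, whose colabels all equal 1, with the binomial coefficient quoted above.

```python
from itertools import product
from math import comb


def integrable_weights(colabels, k):
    """All (lambda_0, ..., lambda_r) in N^(r+1) with sum_i colabels[i]*lambda_i = k."""
    ranges = [range(k // a + 1) for a in colabels]
    return [lam for lam in product(*ranges)
            if sum(a * l for a, l in zip(colabels, lam)) == k]


r, k = 2, 2                                   # A_2^(1) at level 2
weights = integrable_weights([1] * (r + 1), k)
print(len(weights), comb(k + r, r))           # -> 6 6
print((k,) + (0,) * r in weights)             # the weight k*omega_0 -> True
```

For other algebras one would pass the colabels read off from Figure 3.2 in place of the list of ones.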

Fig. 3.4 The Weyl group of $A_2^{(1)}$ acting on level-2 weights.

The ℓ_0-eigenspaces of an integrable representation L(λ) are all finite-dimensional
representations of $\bar{\mathfrak{g}}$, and thus we can define its character $\mathrm{ch}_{L(\lambda)}$ as in (1.5.9a), although
just as for the Virasoro algebra in (3.1.10) it proves to be more convenient to 'normalise'
it:

\[ \chi_\lambda(h) := e^{-(h_\lambda - c_\lambda/24)\,\delta(h)} \sum_{\beta\in\Omega(L(\lambda))} \dim L(\lambda)_\beta\; e^{\beta(h)}, \tag{3.2.9a} \]

where $L(\lambda) = \bigoplus_\beta L(\lambda)_\beta$ is the weight-space decomposition of L(λ), h ∈ h, and

\[ h_\lambda = \frac{(\lambda|\lambda + 2\rho)}{2(k + h^\vee)}, \tag{3.2.9b} \]

\[ c_\lambda = \frac{k\, \dim\bar{\mathfrak{g}}}{k + h^\vee} \tag{3.2.9c} \]

are called the conformal weight and central charge, respectively, of L(λ). The quantity
$h^\vee = \sum_{i=0}^{r} a_i^\vee$ is called the dual Coxeter number and $\rho = \sum_{i=1}^{r} \omega_i$ the Weyl vector.
The algebraic meaning of h_λ and c_λ involves the Virasoro algebra and is given shortly;
δ(h) plucks out the coefficient 2πiτ of ℓ_0 (recall (3.2.5d)). We are assuming in (3.2.9b)
that the highest-weight component u has been set to 0 (Question 3.2.4). We discuss
the normalisation (the exponential involving h_λ − c_λ/24) later in this subsection. As in
(1.5.11), the character χ_λ can be written as an alternating sum over the Weyl group W,
over a 'nice' denominator (namely the product in (3.2.6)). The difference is that W is
now infinite.
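
To get a feeling for the numbers in (3.2.9b) and (3.2.9c), here is a short Python sketch specialised to $A_1^{(1)}$. The specialisation is our assumption, not spelled out in the text: for sl_2 we use dim = 3, h^∨ = 2, ρ = ω_1 and (ω_1|ω_1) = 1/2, so that a level-k weight with Dynkin label λ_1 has (λ|λ + 2ρ) = λ_1(λ_1 + 2)/2.

```python
from fractions import Fraction


def sl2_central_charge(k):
    """c = k dim(sl_2) / (k + h_dual) with dim = 3 and h_dual = 2."""
    return Fraction(3 * k, k + 2)


def sl2_conformal_weight(lam1, k):
    """h = (lam|lam + 2 rho) / (2 (k + 2)) with (lam|lam + 2 rho) = lam1 (lam1 + 2) / 2."""
    return Fraction(lam1 * (lam1 + 2), 4 * (k + 2))


for k in (1, 2):
    c = sl2_central_charge(k)
    for lam1 in range(k + 1):
        h = sl2_conformal_weight(lam1, k)
        print(k, lam1, c, h, h - c / 24)
```

At level 1 this gives c = 1 with conformal weights 0 and 1/4; the last column is the exponent h_λ − c_λ/24 appearing in the normalisation of (3.2.9a).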

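See Figure 3.4 for the Weyl group of $A_2^{(1)}$ (projected to $\overline{\mathfrak{h}}$), and Question 3.2.7 for
some simple calculations. Much of the interest in affine algebras can be traced to the
‘miracle’ that their Weyl groups are a semi-direct product $Q^\vee \rtimes \overline{W}$ of translations in a
lattice $Q^\vee$ (the $r$-dimensional ‘co-root lattice’ of $\overline{g}$) with the (finite) Weyl group $\overline{W}$ of
$\overline{g}$. More precisely, for any root $\alpha$ of $\overline{g}$ define the co-root $\alpha^\vee$ by $\alpha^\vee = 2\alpha/(\alpha|\alpha)$; by the
co-root lattice $Q^\vee \subset \overline{\mathfrak{h}}^* \subset \mathfrak{h}^*$ of $\overline{g}$ we mean the $\mathbb{Z}$-span of these co-roots. For any vector
$\beta \in Q^\vee$, define the map

$$t_\beta(\mu) = \mu + (\mu|\delta)\,\beta - \Big((\mu|\beta) + \frac{(\beta|\beta)\,(\mu|\delta)}{2}\Big)\,\delta, \qquad (3.2.10a)$$

$\forall \mu \in \mathfrak{h}^*$. It is straightforward to verify $t_\beta t_\gamma = t_{\beta+\gamma}$, and thus these deserve the name
‘translations’. Any element of the Weyl group $W$ of $g$ can be written uniquely as a pair
$(t_\beta, w)$ for some $\beta \in Q^\vee$ and some $w \in \overline{W}$, and

$$(t_\beta, w) \circ (t_{\beta'}, w') = (t_\beta\, t_{w(\beta')},\; w w'). \qquad (3.2.10b)$$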
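The identity $t_\beta t_\gamma = t_{\beta+\gamma}$ can be checked mechanically. The following is only a sketch under an assumed coordinate system that is not fixed in the text: a weight is stored as a triple (finite part, level $k$, coefficient $d$ of $\delta$), with $(\mu|\delta) = k$ and $\delta$ orthogonal to the finite part. The names `dot` and `translate` are illustrative.

```python
from fractions import Fraction

def dot(x, y):
    """Euclidean inner product on the finite part (assumed normalisation)."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(x, y))

def translate(beta, mu):
    """Apply t_beta of (3.2.10a) to mu = (finite part, level k, delta-coefficient d).

    Assumption: (mu|delta) = k and (delta|beta) = 0, so only the finite part
    and the delta-coefficient change.
    """
    bar, k, d = mu
    new_bar = tuple(Fraction(m) + k * Fraction(b) for m, b in zip(bar, beta))
    new_d = d - (dot(bar, beta) + dot(beta, beta) * k / 2)
    return (new_bar, k, new_d)

# Two co-roots of A_2, realised in the usual way inside R^3.
beta, gamma = (1, -1, 0), (0, 1, -1)
mu = ((Fraction(1, 3), Fraction(1, 3), Fraction(-2, 3)), 2, Fraction(0))

composed = translate(beta, translate(gamma, mu))
direct = translate(tuple(b + g for b, g in zip(beta, gamma)), mu)
print(composed == direct)
```

The quadratic term $(\beta|\beta)(\mu|\delta)/2$ is exactly what makes the composition additive; dropping it breaks the equality.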


As in (1.5.6d), weights $\mu \in \Omega(L(\lambda))$ in the same Weyl orbit of an integrable module
have the same multiplicities. One thing this implies is that $\chi_\lambda$ will be of the form ‘theta
series’/denominator. In particular, the lattice is $Q^\vee$, and the ‘$(\beta|\beta)\delta$’ term in (3.2.10a)
provides the quadratic form in the lattice theta series. As we know from (2.3.10), theta
series are modular forms, and this is the second complementary reason the modular
group $\mathrm{SL}_2(\mathbb{Z})$ makes an appearance (the first was the combinatorics of the free action of
the Heisenberg subalgebra of creation operators). To make this more precise, consider
the highest weight $\lambda = \lambda_0\omega_0 + \lambda_1\omega_1 \in P_+^k(A_1^{(1)})$. Then

$$\chi_\lambda(2\pi i(z + \tau\ell_0 + uC)) = \frac{\Theta^{(k+2)}_{\lambda_1+1}(\tau, z, u) - \Theta^{(k+2)}_{-\lambda_1-1}(\tau, z, u)}{\Theta^{(2)}_{1}(\tau, z, u) - \Theta^{(2)}_{-1}(\tau, z, u)}, \qquad (3.2.11a)$$

$$\Theta^{(n)}_{m}(\tau, z, u) = e^{-2\pi i n u} \sum_{\ell\in\mathbb{Z}+\frac{m}{2n}} \exp\big[2\pi i n \tau \ell^2 - 2\sqrt{2}\,\pi i n \ell z\big]. \qquad (3.2.11b)$$


Any $g$ has an analogue of (3.2.11), the Weyl–Kac character formula

$$\chi_\lambda(2\pi i(z + \tau\ell_0 + uC)) = \frac{\sum_{w\in\overline{W}} \epsilon(w)\,\Theta^{(k+h^\vee)}_{w(\lambda+\rho)}(\tau, z, u)}{\sum_{w\in\overline{W}} \epsilon(w)\,\Theta^{(h^\vee)}_{w(\rho)}(\tau, z, u)}, \qquad (3.2.11c)$$

where both the numerator and denominator involve an alternating sum over the finite
Weyl group $\overline{W}$ of $\overline{g}$, and where the theta series in (3.2.11c) involves a sum over the
lattice $Q^\vee$ shifted by some weight and appropriately rescaled. For example, the Weyl
group of $A_1$ is $S_2$ and its co-root lattice $Q^\vee$ is $\sqrt{2}\,\mathbb{Z}$. The key variable in (3.2.11) is the
modular one $\tau$; the main role of the other variables is to ensure linear independence.
The character $\chi_\lambda$ converges for any choice of $\tau \in \mathbb{H}$, $z \in \mathbb{C}^r$ and $u \in \mathbb{C}$.

Thus the denominator of the character of an irreducible integrable $g$-module $L(\lambda)$ is
a modular form, by virtue of the combinatorics of Verma modules. The numerator is a
modular form, by virtue of the structure and action of the affine Weyl group. Together
they give a modular function.


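Theorem 3.2.3 [333] Let $\overline{g}$ be finite-dimensional and simple, and let $g = \overline{g}^{(1)}$ be
the corresponding affine algebra. Define $\chi_\lambda(\tau, z, u) = \chi_\lambda(2\pi i(z + \tau\ell_0 + uC))$. Fix any
level $k \in \mathbb{N}$. Then for any integrable weight $\lambda \in P_+^k(g)$, $\chi_\lambda(\tau, 0, 0)$ is a modular function
for some congruence subgroup $\Gamma(N)$. Moreover, define a column vector $\vec{\chi}(\tau, z, u)$ with
entries $\chi_\lambda(\tau, z, u)$ for each $\lambda \in P_+^k(g)$. Then there is a unitary representation $\rho$ of $\mathrm{SL}_2(\mathbb{Z})$
such that

$$\vec{\chi}\Big(\frac{a\tau+b}{c\tau+d},\ \frac{z}{c\tau+d},\ u - \frac{c\,(z|z)}{2(c\tau+d)}\Big) = \rho\begin{pmatrix} a & b \\ c & d \end{pmatrix} \vec{\chi}(\tau, z, u)$$

for any $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$.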


We say that the characters $\chi_\lambda$ define a vector-valued Jacobi form for $\mathrm{SL}_2(\mathbb{Z})$, with
multiplier $\rho$ (recall Definition 2.2.2). This modularity of affine characters is fundamental to this
book, and a prototypical example of much of what follows. The complex matrices $\rho(A)$
here are examples of modular data (Sections 6.1.2 and 6.2.1). A value of $N$ for which $\Gamma(N)$
works uniformly in Theorem 3.2.3 is the least common multiple of all denominators of
$h_\lambda - c_\lambda/24$ (these will always be rational), as $\lambda$ runs through the finite set $P_+^k(g)$.
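As a small illustration of this recipe, the sketch below evaluates (3.2.9b), (3.2.9c) and the resulting $N$ for $A_1^{(1)}$. It assumes the standard $A_1$ data, which are not spelled out at this point of the text: $\dim A_1 = 3$, $h^\vee = 2$, and $(\lambda|\lambda+2\rho) = \lambda_1(\lambda_1+2)/2$ for a level-$k$ weight with Dynkin label $\lambda_1 \in \{0, \ldots, k\}$. The function names are illustrative only.

```python
from fractions import Fraction
from math import gcd

def a1_weights_and_charge(k):
    """Return ([h_lambda for lambda_1 = 0..k], c) for A_1^(1) at level k.

    Assumes dim A_1 = 3, dual Coxeter number 2 and
    (lambda|lambda + 2 rho) = lambda_1 (lambda_1 + 2) / 2.
    """
    c = Fraction(3 * k, k + 2)
    hs = [Fraction(l * (l + 2), 4 * (k + 2)) for l in range(k + 1)]
    return hs, c

def uniform_level(k):
    """Least common multiple of the denominators of h_lambda - c/24."""
    hs, c = a1_weights_and_charge(k)
    n = 1
    for h in hs:
        d = (h - c / 24).denominator
        n = n * d // gcd(n, d)
    return n

for k in (1, 2):
    hs, c = a1_weights_and_charge(k)
    print(k, c, [h - c / 24 for h in hs], uniform_level(k))
```

At level 1 the two exponents are $-1/24$ and $5/24$, so this recipe gives $N = 24$.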

We can now explain McKay’s observation (0.5.1) that the coefficients of $j(\tau)^{1/3}$
are related to the $E_8$ Lie group. $j(\tau)^{1/3}$ equals the character $\chi_{\omega_0}(\tau, 0, 0)$ of the
integrable $E_8^{(1)}$-module $L(\omega_0)$. The $q$-coefficients ($q = e^{2\pi i\tau}$) of $j^{1/3}(\tau)$ are thus dimensions of the
$\ell_0$-eigenspaces of $L(\omega_0)$, which are automatically $E_8$-modules. Because $P_+^1(E_8^{(1)}) =
\{\omega_0\}$, all modularity properties of the character $j^{1/3}$ are a direct consequence of Theorem
3.2.3.
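The first few of these dimensions are easy to compute. The sketch below relies on the standard identity $j^{1/3} = E_4/\eta^8$ (a consequence of $j = E_4^3/\eta^{24}$), which is assumed here rather than taken from the text, and lists the coefficients of $q^{1/3} j(\tau)^{1/3}$. The helper names are illustrative.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def mul(a, b, n):
    """Product of two power series, truncated to n coefficients."""
    out = [0] * n
    for i, x in enumerate(a):
        if x:
            for j, y in enumerate(b[: n - i]):
                out[i + j] += x * y
    return out

def j_cube_root_coeffs(n):
    """First n coefficients of q^{1/3} j^{1/3} = E_4 / prod_{m>=1} (1 - q^m)^8."""
    e4 = [1] + [240 * sigma3(m) for m in range(1, n)]
    inv_eta8 = [1] + [0] * (n - 1)
    for m in range(1, n):
        for _ in range(8):              # divide by (1 - q^m) eight times
            for i in range(m, n):
                inv_eta8[i] += inv_eta8[i - m]
    return mul(e4, inv_eta8, n)

print(j_cube_root_coeffs(5))   # [1, 248, 4124, 34752, 213126]
```

The coefficient 248 is the dimension of the adjoint representation of $E_8$, and $4124 = 1 + 248 + 3875$ is consistent with each eigenspace being an $E_8$-module.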

All of this assumes the underlying finite-dimensional Lie algebra $\overline{g}$ is semi-simple.
When it is merely reductive (i.e. the direct sum of copies of the one-dimensional abelian
algebra $\mathfrak{u}_1$, with a number of simple Lie algebras), something different happens. For
example, consider the affinisation of $\mathfrak{u}_1$ (the oscillator algebra). It has basis $C$, $a_n$ ($n \in \mathbb{Z}$)
and obeys relations

$$[C, a_n] = 0, \qquad [a_m, a_n] = m\,\delta_{m,-n}\,C. \qquad (3.2.12a)$$

Its irreducible unitary modules are parametrised by a highest weight $\lambda \in \mathbb{R}$, and are
Verma modules $M(\lambda)$. In particular, any $\lambda \in \mathbb{R}$ defines a different irreducible unitary
module. They can be realised in the space of polynomials $\mathbb{C}[x_1, x_2, \ldots]$ by the operators
$C.p(x) = p(x)$, $a_0.p(x) = \lambda\, p(x)$, and for all $n \ge 1$

$$a_n.p(x) = \frac{\partial}{\partial x_n}\,p(x), \qquad a_{-n}.p(x) = n\,x_n\,p(x). \qquad (3.2.12b)$$

Note that the level $k$ here is 1 (why can we demand $k = 1$?). The reader can verify that
this representation has (normalised) character

$$\chi_\lambda(\tau) = q^{\lambda^2/2}/\eta(\tau). \qquad (3.2.12c)$$

These characters aren’t linearly independent (since $\chi_{-\lambda} = \chi_\lambda$), but the reader can work
out the usual remedy. Their modularity is discussed in Section 6.2.2. In the language
of conformal field theory, the unitary modules of the oscillator algebra $\mathfrak{u}_1^{(1)}$ are
quasi-rational while the integrable modules of affine algebras are rational. Nevertheless, the
oscillator algebra (studied in detail in [334]) is a convenient toy model for the affine
algebras.

Last subsection we saw that Witt acts naturally on loop algebras by derivations. Does
Witt act on affine modules? Consider the oscillator algebra for simplicity. We will have
a universal Witt action on $\mathfrak{u}_1^{(1)}$-modules $M$ if we can construct the basis $\ell_n$ of (1.4.9) out
of the operators $a_m$ of (3.2.12a), that is, realise the $\ell_n$ in the universal enveloping algebra
$U(\mathfrak{u}_1^{(1)})$ (or some completion thereof). We are led to consider quadratic combinations
in the $a_m$, since that is the simplest after linear ones (which won’t work), and also
since $\ell_0$ has the interpretation of a Hamiltonian, which always contains a quadratic part.
Define

$$t_m = \sum_{i\in\mathbb{Z}} a_{-i}\,a_{m+i}. \qquad (3.2.13a)$$

Being an infinite sum, convergence won’t be automatic, but let’s ignore that for now.
Then

$$[t_m, a_n] = \sum_{i\in\mathbb{Z}} \big(a_{-i}[a_{m+i}, a_n] + [a_{-i}, a_n]\,a_{m+i}\big) = -n\,a_{m+n}C - n\,C\,a_{m+n} = -2n\,C\,a_{m+n}. \qquad (3.2.13b)$$

In, for example, a highest-weight module, $C$ acts as a scalar $k$, and so (at least for $k \ne 0$)
$\ell'_m = \frac{1}{2k}\,t_m$ mimics the action of the standard Witt action $\ell_m = -t^{m+1}\,d/dt$ on the loop
algebra $\mathcal{L}_{poly}\mathfrak{u}_1$. This looks promising. We compute from (3.2.13b)

$$[\ell'_m, \ell'_n] = (2k)^{-2} \sum_{i\in\mathbb{Z}} [t_m, a_{-i}]\,a_{n+i} + (2k)^{-2} \sum_{j\in\mathbb{Z}} a_{-j}\,[t_m, a_{n+j}]$$
$$= (2k)^{-1} \sum_{i\in\mathbb{Z}} i\,a_{m-i}\,a_{n+i} + (2k)^{-1} \sum_{j\in\mathbb{Z}} (-n-j)\,a_{-j}\,a_{m+n+j} = (m-n)\,\ell'_{m+n},$$

where the last step shifts the index $i = m + j$ in the first sum. This establishes that indeed the $\ell'_m$ form a realisation of Witt in $U(\mathfrak{u}_1^{(1)})$.
Unfortunately, the sum in (3.2.13a) doesn’t converge. Take $M$ to have highest-weight
vector $v$ with highest weight $(\lambda, k)$. Then

$$t_0\,v = \sum_{i\le -1} (a_i a_{-i} - iC).v + a_0a_0.v + \sum_{j\ge 1} a_{-j}a_j\,v = \lambda^2 v + k\sum_{i\ge 1} i\;v, \qquad (3.2.13c)$$

which diverges. This means (3.2.13a) must be modified. The simplest correction can
be written $T_m := \sum_{i\in\mathbb{Z}} :a_{-i}a_{m+i}:$, where the normal-ordering $:a_m a_n:$ is defined to
equal either $a_m a_n$ or $a_n a_m$, depending on whether or not $m < n$. For $m \ne 0$, $T_m = t_m$,
but $T_0.v = \lambda^2 v$. Indeed, each operator $T_m$ will be defined on any Fock space. We find
that

$$L_m = (2k)^{-1} \sum_{i=-\infty}^{\infty} :a_{-i}\,a_{m+i}: \qquad (3.2.14a)$$

satisfies both

$$[L_m, a_n] = -n\,a_{m+n}, \qquad (3.2.14b)$$

$$[L_m, L_n] = (m-n)\,L_{m+n} + \frac{m^3-m}{12}\,\delta_{m,-n}. \qquad (3.2.14c)$$

Thus any highest-weight $\mathfrak{u}_1^{(1)}$-module is simultaneously a Vir-module with central
charge $c = 1$. This nonzero central charge thus arises as an analytic effect.
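The central term can be observed directly in the polynomial realisation (3.2.12b). The sketch below is illustrative only and assumes level $k = 1$: it stores a polynomial in $x_1, x_2, \ldots$ as a dictionary from monomials to coefficients, builds the normal-ordered $L_m$ of (3.2.14a), and evaluates $[L_m, L_{-m}]$ on the vacuum $v = 1$. By (3.2.14c) and $L_0 v = (\lambda^2/2)\,v$, the answer should be $(m\lambda^2 + (m^3-m)/12)\,v$. All function names are hypothetical.

```python
from fractions import Fraction

# A polynomial is a dict {monomial: coefficient}; a monomial is a sorted
# tuple of (variable index, exponent) pairs, so () is the constant 1.

def mul_x(p, n, c):
    """Multiply p by c * x_n."""
    out = {}
    for mono, coef in p.items():
        d = dict(mono)
        d[n] = d.get(n, 0) + 1
        key = tuple(sorted(d.items()))
        out[key] = out.get(key, 0) + c * coef
    return out

def d_x(p, n):
    """Partial derivative of p with respect to x_n."""
    out = {}
    for mono, coef in p.items():
        d = dict(mono)
        e = d.pop(n, 0)
        if e == 0:
            continue
        if e > 1:
            d[n] = e - 1
        key = tuple(sorted(d.items()))
        out[key] = out.get(key, 0) + e * coef
    return out

def a(n, p, lam):
    """Oscillator a_n acting as in (3.2.12b), with a_0 = lam and C = 1."""
    if n == 0:
        return {mono: lam * c for mono, c in p.items()}
    return d_x(p, n) if n > 0 else mul_x(p, -n, -n)

def L(m, p, lam):
    """Normal-ordered L_m of (3.2.14a) at k = 1; only finitely many terms act nontrivially."""
    top = max((v for mono in p for v, _ in mono), default=0)
    bound = top + abs(m) + 1
    out = {}
    for i in range(-bound, bound + 1):
        lo, hi = sorted((-i, m + i))          # larger index acts first
        for mono, c in a(lo, a(hi, p, lam), lam).items():
            out[mono] = out.get(mono, 0) + c / 2
    return {mono: c for mono, c in out.items() if c != 0}

def vacuum_bracket(m, lam):
    """[L_m, L_{-m}] applied to the vacuum polynomial 1."""
    v = {(): Fraction(1)}
    out = dict(L(m, L(-m, v, lam), lam))
    for mono, c in L(-m, L(m, v, lam), lam).items():
        out[mono] = out.get(mono, 0) - c
    return {mono: c for mono, c in out.items() if c != 0}

print(vacuum_bracket(2, Fraction(3)))   # expect 2*9 + 6/12 = 37/2 times the vacuum
```

The extra $(m^3-m)/12$ comes entirely from the pairs of creation operators in $L_{-m}v$ being contracted by $L_m$, which is the ‘analytic effect’ referred to above.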

Using (3.2.12a), this normal-ordering (3.2.14) doesn’t change $L_n = \ell'_n$ for $n \ne 0$, but
shifts the divergent $\ell'_0$ by an infinite multiple of $C$ (proportional to the divergent sum $\sum_{i\ge 1} i$). There is nothing particularly
special about this normal-ordering; for example, for any fixed $\ell$ we could have replaced
the condition ‘$m < n$’ with ‘$m < n + \ell$’, and nothing would have changed except $L_0$
would have been shifted by some other multiple of $C$. This is a clue to understanding what
is so special about the $-c/24$ shifts in, for example, (3.1.10) or (3.2.9a). The arbitrariness
of the normal-ordering can be removed by reinterpreting (‘regularising’) the divergent
term in (3.2.13c) as $k\zeta(-1)$ (recall (2.3.1)). Equivalently, this amounts to replacing the
normal-ordered $L_0$ with $L_0 - C/24$. This is the algebraic ‘explanation’ for the naturality
of the shift, and hence the pervasive appearance of $-c/24$: simply put, algebra prefers
$L_0 - C/24$ over all other combinations $L_0 + \alpha C$ (recall Question 3.1.8). It should thus
not come as a complete surprise that so too does $\mathrm{SL}_2(\mathbb{Z})$. Incidentally, this ‘24’, $\zeta(-1)$,
the special dimensions $8 + 2$ and $24 + 2$ in string theory and the 24 of Section 2.5.1 are
all directly related.

More generally, Bloch [63] considered other algebras of differential operators on $S^1$.
In particular, in place of $\ell_n = -t^{n+1}\,d/dt$ he considers

$$\ell_n^{(r)} = (-1)^{r+1}\,(t\,d/dt)^r\; t^n\; (t\,d/dt)^{r+1}.$$

He obtains a (projective) realisation of these $\ell_n^{(r)}$ by normal-ordering operators in a Fock
(or highest-weight) module, exactly as we do here. The analogue of $(m^3 - m)/12$ in
the bracket $[L_m^{(r)}, L_{-m}^{(s)}]$ is a polynomial of degree $2r + 2s + 3$ in $m$. As before, we want
to remove this arbitrary choice of normal-ordering. Naively dropping it introduces the
divergence $1^{2r+1} + 2^{2r+1} + \cdots$, so as before replace it with the Riemann zeta value
$\zeta(-1-2r)$, i.e. replace $L_0^{(r)}$ with $L_0^{(r)} + (-1)^r\zeta(-1-2r)\,C/2$. Then the polynomial
in $m$ becomes the monomial $(r+s+1)!\,(r+s+1)!\;m^{2r+2s+3}/(2\,(2r+2s+3)!)$. This
appearance of ‘zeta function regularisation’ in algebra has been interpreted and
generalised in the vertex operator algebra framework (see [375] for a review).
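For the oscillator algebra itself (the case $r = 0$), the regularisation can be written out in two lines. This is a sketch using only (3.2.13c), the standard value $\zeta(-1) = -1/12$, and the assumption $k \ne 0$; here $v$ is the highest-weight vector with highest weight $(\lambda, k)$:

```latex
\begin{align*}
t_0\, v &= \lambda^2 v + k\Big(\sum_{i\ge 1} i\Big) v
  \;\longrightarrow\; \lambda^2 v + k\,\zeta(-1)\, v
  = \Big(\lambda^2 - \frac{k}{12}\Big) v, \\
\frac{1}{2k}\, t_0\, v &\;\longrightarrow\;
  \Big(\frac{\lambda^2}{2k} - \frac{1}{24}\Big) v
  = \Big(L_0 - \frac{1}{24}\Big) v .
\end{align*}
```

At $k = 1$ the exponent $\lambda^2/2 - 1/24$ is exactly the leading power of $q$ in the normalised character (3.2.12c), since $1/\eta(\tau)$ contributes $q^{-1/24}$.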

Identical comments hold for affine algebras. Choose a basis $x_a$ of $\overline{g}$, orthonormal with
respect to the Killing form: $\kappa(x_a|x_b) = \delta_{ab}$. Then for $\lambda \in P_+^k(g)$, (3.2.14) become

$$L_n = \frac{1}{2(k+h^\vee)} \sum_a \sum_{i\in\mathbb{Z}} :(x_a t^{-i})(x_a t^{n+i}):, \qquad (3.2.15a)$$

$$[L_m, x t^n] = -n\,x t^{m+n}, \quad \forall x \in \overline{g}, \qquad (3.2.15b)$$

$$[L_m, L_n] = (m-n)\,L_{m+n} + c_\lambda\,\frac{m^3-m}{12}\,\delta_{m,-n}. \qquad (3.2.15c)$$

Thus the $g$-module $L(\lambda)$ is also automatically a completely reducible Vir-module. Each
irreducible Vir-submodule has central charge $c_\lambda$ and conformal weight $h \in h_\lambda + \mathbb{N}$
(see (3.2.9)). In $L(\lambda)$, the Virasoro generator $L_0$ and the derivation $\ell_0$ of $g$ are related
by $L_0 = h_\lambda\,\mathrm{id} + \ell_0$. Equation (3.2.15a), known as the Sugawara construction, should
remind us of the quadratic Casimir $Q := \sum_a x_a x_a$ of $\overline{g}$, that is, the simplest nontrivial
element in the centre of $U(\overline{g})$; it acts on the irreducible $\overline{g}$-module $L(\lambda)$ as multiplication
by the scalar $(\lambda|\lambda + 2\rho)$ (recall (3.2.9b)). The shift by the dual Coxeter number $h^\vee$ in
(3.2.15a) arises algebraically as the eigenvalue of $Q$ in the adjoint representation of $\overline{g}$;
its physical significance is discussed in Section 6.2.1.

The integrable modules of twisted affine algebras $X_r^{(N)}$ (recall Figure 3.3) behave
similarly. As we know from (3.2.4), $X_r^{(N)}$ is obtained from the nontwisted affine algebra
$g = X_r^{(1)}$ and an order-$N$ symmetry $\alpha$ of the Coxeter–Dynkin diagram of $X_r$. The
integrable highest-weight $X_r^{(N)}$-modules $L(\lambda)$ are parametrised by $(r+1)$-tuples $\lambda \in P_+^k$
as in (3.2.8), where the co-labels $a_i^\vee$ are now given in Figure 3.3. These modules also
have weight-space decompositions as in (1.5.6a) and characters $\chi_\lambda$ as in (3.2.9a). Their
characters are also modular (see theorem 13.9 of [328] for details).


Theorem 3.2.4 [333] The characters $\chi_\lambda$, $\lambda \in P_+^k(A_{2r}^{(2)})$, form a vector-valued Jacobi
function for $\mathrm{SL}_2(\mathbb{Z})$, as in Theorem 3.2.3. For $g = A_{2r-1}^{(2)}$, $D_{r+1}^{(2)}$, $E_6^{(2)}$ and $D_4^{(3)}$,
respectively, define $g' = D_{r+1}^{(2)}$, $A_{2r-1}^{(2)}$, $E_6^{(2)}$, $D_4^{(3)}$ and $N = 2, 2, 2, 3$; then the
characters $\chi_\lambda$, $\lambda \in P_+^k(g)$, form a vector-valued Jacobi function for $\Gamma_0(N)$ (recall (2.2.4b)),
and for each $\lambda \in P_+^k(g)$,

$$\chi_\lambda\Big(\frac{-1}{\tau},\ \frac{z}{\tau},\ u - \frac{(z|z)}{2\tau}\Big) = \sum_{\mu\in P_+^k(g')} S_{\lambda\mu}\;\chi_\mu\Big(\frac{\tau}{N},\ \frac{z}{N},\ u\Big).$$


3.2.4 Braided #3: braids and affine algebras 


According to conformal field theory, the modularity of, for example, affine algebra 
characters arises through the monodromy of a system of partial differential equations (the 
Knizhnik—Zamolodchikov equations for a torus with one puncture). In this subsection 
we anticipate this important idea by considering the simpler and better-known situation 
of a sphere. See also [355], [174]; the basic idea of differential equation monodromy is 
nicely described in [363]. 


Theorem 3.2.5 Consider a simply-connected open region $D$ in $\mathbb{C}$. Consider the
differential equation

$$\frac{d^2w}{dz^2} + P(z)\,\frac{dw}{dz} + Q(z)\,w = 0, \qquad (3.2.16a)$$

where $P(z)$ and $Q(z)$ are holomorphic in $D$. For any point $z_0 \in D$, and any $\alpha, \beta \in \mathbb{C}$,
there is a unique function $w(z)$, holomorphic in $D$, satisfying the initial conditions

$$w(z_0) = \alpha, \qquad (3.2.16b)$$

$$\frac{dw}{dz}(z_0) = \beta. \qquad (3.2.16c)$$


Hence the solutions $w$ to (3.2.16a) form a two-dimensional space, parametrised by
$\alpha, \beta \in \mathbb{C}$. For a proof of this theorem, see, for example, chapter XII of [307].

What if $D$ is not simply-connected? One way to proceed would be to make $D$
simply-connected by cutting it. For example, if $D$ is $\mathbb{C}$ with $n$ points $z_1, \ldots, z_n$ removed, then
we can cut $D$ along a non-self-intersecting polygonal path connecting $z_1, \ldots, z_n$ and
$\infty$, avoiding the point $z_0$. Call $D'$ the resulting simply-connected subregion of $D$. Then
a holomorphic function on $D$ restricts to a holomorphic function on $D'$; however, most
holomorphic functions on $D'$ won’t extend continuously to $D$.

The other way to proceed is to consider the (simply-connected) universal cover $\pi :
\widetilde{D} \to D$ (recall Section 2.1.2). We can then identify $D$ with $\widetilde{D}/G$ for some group $G$
isomorphic to the fundamental group $\pi_1(D)$; each $\gamma \in G$ is an automorphism of $\widetilde{D}$
shuffling the points $\widetilde{z} \in \pi^{-1}(z)$ above each $z \in D$. Functions $h$ holomorphic on $D$ lift to
functions $h \circ \pi$ holomorphic on $\widetilde{D}$, although a typical function $\widetilde{h}$ on $\widetilde{D}$ won’t correspond
to a well-defined function on $D$. However, $\pi^{-1}(D') \subset \widetilde{D}$ consists of several connected
open components, one for each $\gamma \in \pi_1(D)$, and through this there is a many-to-one
correspondence between the holomorphic functions on $D'$ and those on $\widetilde{D}$.

Let’s return to the situation of Theorem 3.2.5, except with $D$ now being
non-simply-connected (although still connected). Then there is a unique solution $w$ to (3.2.16) in
$D'$. Writing $\widetilde{P} = P \circ \pi$ and $\widetilde{Q} = Q \circ \pi$, and choosing any $\widetilde{z}_0 \in \pi^{-1}(z_0)$, we can lift the
equations (3.2.16) to $\widetilde{D}$ and again we obtain a unique solution $\widetilde{w}$, this time holomorphic
in $\widetilde{D}$. The spaces of solutions $w$ on $D'$ and $\widetilde{w}$ on $\widetilde{D}$ are both two-dimensional. But we get
more: both spaces carry naturally an action of the fundamental group $\pi_1(D)$, called
the monodromy representation. More precisely, each automorphism $\gamma^* \in G \cong \pi_1(D)$
carries a solution $\widetilde{w}$ of (3.2.16a) to another solution $\widetilde{w} \circ \gamma^*$: it preserves $\alpha, \beta$ but
changes the choice $\widetilde{z}_0 \in \pi^{-1}(z_0)$. It corresponds to an analytic continuation of $w$ across
the polygonal path cut out from $D$, along closed paths $\gamma$ corresponding to $\gamma^*$.

A simple example should make this clear. Consider

$$\frac{d^2w}{dz^2} + z^{-1}\,\frac{dw}{dz} = 0. \qquad (3.2.17a)$$

Here, $D$ is the punctured plane $\mathbb{C} \setminus \{0\}$ so we can take $D'$ to be $\mathbb{C}$ with the negative
real axis removed. The fundamental group $\pi_1(D)$ is $\mathbb{Z}$, and the universal cover $\widetilde{D}$ is
the infinite spiral staircase. Two solutions to (3.2.17a) in $D'$ are $w = \log z$ and $w = 1$.
Analytically extend $w(z) = \log z$ along the unit circle starting at $z_0 = 1$ and running
counterclockwise: as we cross the negative real axis continuity requires the value of $w$ to
be shifted by $2\pi i$ from its previous ‘principal’ value. More generally, the path $\gamma^* = n$,
winding $n$ times around the origin, would pick up a monodromy of $2\pi i n$. On the other
hand, the constant solution $w(z) = 1$ is of course unchanged under analytic continuation.
In terms of our basis $\{\log z, 1\}$, we thus obtain the monodromy representation

$$n \mapsto \begin{pmatrix} 1 & 2\pi i n \\ 0 & 1 \end{pmatrix}. \qquad (3.2.17b)$$
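The off-diagonal entry can be seen numerically by continuing the solution step by step instead of using a branch of the logarithm. The sketch below integrates $dw = dz/z$ along the unit circle with a midpoint rule. The function name and step count are illustrative choices, and the winding number is assumed to be nonzero.

```python
import cmath

def log_monodromy(windings=1, steps_per_turn=20000):
    """Change in w = log z after continuing around the origin `windings` times.

    Integrates dw = dz / z along the unit circle from z = 1 (counterclockwise
    for positive windings) using the midpoint rule; windings must be nonzero.
    """
    steps = abs(windings) * steps_per_turn
    dtheta = 2 * cmath.pi * windings / steps
    w = 0j
    z = 1 + 0j
    for s in range(1, steps + 1):
        z_next = cmath.exp(1j * dtheta * s)
        w += (z_next - z) / ((z_next + z) / 2)
        z = z_next
    return w

print(log_monodromy(1))    # close to 2*pi*i
print(log_monodromy(-2))   # close to -4*pi*i
```

The constant solution $w = 1$ has $dw = 0$, so the same procedure returns 0 for it, which accounts for the second column of (3.2.17b).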


We are interested here in a slightly more complicated situation than that of
Theorem 3.2.5. Let $\overline{g}$ be any finite-dimensional semi-simple Lie algebra and choose
$n$ distinct points $z_1, \ldots, z_n$ in $\mathbb{C}$. Recall the space $\mathfrak{C}_n$ defined in (1.2.6). Choose a basis
$x_a$ of $\overline{g}$, orthonormal with respect to the Killing form $\kappa$. For each $i$, choose a
finite-dimensional $\overline{g}$-representation $R_i$, acting on a space $V_i$. Fix some complex number $y \ne 0$.
By the Knizhnik–Zamolodchikov (or KZ) equations we mean

$$\frac{\partial w}{\partial z_i} = \frac{1}{y} \sum_{j\ne i} \sum_a \frac{R_i(x_a) \otimes R_j(x_a)}{z_i - z_j}\; w, \qquad 1 \le i \le n, \qquad (3.2.18a)$$

where $w : \mathfrak{C}_n \to V_1 \otimes \cdots \otimes V_n$, and where $R_i(x_a)$, $R_j(x_a)$ act on the $i$th, $j$th components
of the multilinear form $w$.

We recognise in (3.2.18a) the quadratic Casimir $Q = \sum_a x_a x_a$ discussed after (3.2.15).
Physically (i.e. in the context of conformal field theory), $w$ is a chiral block on the sphere
$\mathbb{P}^1(\mathbb{C})$ with $n + 1$ distinct marked points (namely $z_1, \ldots, z_n$ and $z_{n+1} = \infty$) for a
Wess–Zumino–Witten model (Section 4.3.2). Geometrically (see e.g. [338]),

$$d - \frac{1}{y} \sum_{i\ne j} \sum_a R_i(x_a) \otimes R_j(x_a)\; \frac{dz_i}{z_i - z_j} \qquad (3.2.18b)$$

defines a connection (Section 1.2.2) on the trivial vector bundle $\mathfrak{C}_n \times W$, for $W =
V_1 \otimes \cdots \otimes V_n$. An easy calculation verifies this connection is flat (i.e. has 0 curvature).
The partial differential equations (3.2.18a) say that $w$ is a horizontal or parallel section.
In other words, restricting to a simply-connected subregion $\mathfrak{C}'_n$ of $\mathfrak{C}_n$, the unique solution
$w(z_1, \ldots, z_n)$ to (3.2.18a) satisfying some initial condition $w(z^{(0)}) = w^{(0)}$ is obtained
geometrically by parallel-transporting the vector $w^{(0)}$ along any path $\gamma$ in $\mathfrak{C}'_n$ connecting
$z^{(0)}$ to the desired point $(z_1, \ldots, z_n)$.

Our context here is thus analogous to that of Theorem 3.2.5: parallel transport plays
the role of analytic continuation, and the flatness of the connection corresponds to the Monodromy
Theorem of complex analysis (e.g. theorem 16.15 of [481]). The result is that the space
of solutions to (3.2.18a) carries a representation of the fundamental group $\pi_1(\mathfrak{C}_n)$, i.e.
of the pure braid group $P_n$. We get an action of the full braid group through
‘half-monodromies’: a braid $\beta \in B_n$ will take a solution $w$ of (3.2.18a) to a solution of (3.2.18a)
with values in $V_{\beta 1} \otimes \cdots \otimes V_{\beta n}$, where $\beta$ acts on the indices $\{1, \ldots, n\}$ through the
natural homomorphism $\phi : B_n \to S_n$ described in Section 1.1.4. In particular, if all $V_i$
are isomorphic, the space of solutions of (3.2.18a) will carry a representation of the full
group $B_n$.

The infinitely many irreducible finite-dimensional modules of a simple Lie algebra 
naturally span a symmetric monoidal category (recall Section 1.6.2 for definitions); its 
character ring is isomorphic to a polynomial ring in $r$ variables, where $r$ is the rank of the
algebra. On the other hand, the finitely-many level-k irreducible integrable modules of a 
nontwisted affine algebra span a braided monoidal category (in fact ribbon and modular 
categories); the corresponding character ring is called a fusion ring and is described 
in Section 6.2.1. The key ingredient in this category — the braiding — comes from this 
braid group monodromy. In Section 6.2.2 we see that this braid group monodromy, 
and associated braided monoidal category, generalise to the modules of sufficiently nice 
vertex operator algebras, and this (or if you prefer, conformal field theories) serves as 
the natural context for the modularity in Moonshine. 

There are many other occurrences of the braid group in the mathematics and physics 
neighbouring Moonshine, and most of these are directly related to this KZ monodromy 
on a sphere. For example, the knot invariants arising from subfactors and quantum groups 
come from braid group representations, and Drinfel’d and Kohno have proved that these 
representations are the same ones coming from KZ monodromy. 

On the other hand, the relation of the braid group B3 to SL2(Z) and its modular 
functions, which we have seen already in Section 2.4.3 and which we argue later plays a 
fundamental role in Monstrous Moonshine, does not have a direct relation to this braid 
group monodromy. But we will see later that modularity too is due to monodromy of
a system of partial differential equations (the analogue of these KZ equations for a
once-punctured torus) defining a flat connection on the extended moduli space $\widehat{\mathfrak{M}}_{1,1}$.
The solutions of these equations are spanned by the affine algebra characters (or more
generally the vertex operator algebra one-point functions). The associated monodromy
group is the mapping class group of $\widehat{\mathfrak{M}}_{1,1}$, which is readily seen to be $B_3$.

Intriguingly, this means that we’ve come full circle. Poincaré’s 125-year-old path
to modular functions (see [259] for a review) was differential equations of the form
(3.2.16a). Let $f(z)$, $g(z)$ be a basis for the space of solutions, and write $\xi(z) = f(z)/g(z)$.
Note that the monodromy group acts on $\xi$ by Möbius transformations: $\xi \mapsto \frac{a\xi+b}{c\xi+d}$.
Poincaré found that, at least in some cases, when we invert $\xi(z)$ and write $z$ as a function
of $\xi$, then $z$ will be a modular function for some discrete subgroup of $\mathrm{SL}_2(\mathbb{R})$, acting on
$\xi \in \mathbb{H}$.

A simple example is Legendre’s equation

$$\frac{d^2y}{dk^2} + \frac{1-3k^2}{k(1-k^2)}\,\frac{dy}{dk} - \frac{y}{1-k^2} = 0.$$

This has the elliptic periods $K(k)$ and $K'(k) = K(k')$ as solutions (recall Section 2.2.1).
It is more convenient to change variables to $z = k^2$, when this equation becomes

$$\frac{d^2w}{dz^2} + \frac{1-2z}{z(1-z)}\,\frac{dw}{dz} - \frac{w}{4z(1-z)} = 0. \qquad (3.2.19a)$$

Then $K'(z) = K(1-z)$, since $k^2 + k'^2 = 1$. The domain $D$ is the plane with $z = 0$ and
$z = 1$ removed; its fundamental group $\pi_1$ is the free group $F_2 = \langle\sigma_0, \sigma_1\rangle$ generated
by counter-clockwise loops $\sigma_k$ about $z = k$. It turns out that $K(z)$ is holomorphic at
$z = 0$, but $K'(z)$ has a logarithmic singularity there: $K'(z) + \frac{1}{\pi}K(z)\log z$ is
holomorphic at $z = 0$. Thus as we go counter-clockwise in a small circle about $z = 0$, $K(z)$ is
unchanged but $K'(z)$ becomes $K'(z) - 2iK(z)$. Hence, as we go counter-clockwise in a
small circle about $z = 1$, $K'(z)$ is unchanged but $K(z)$ becomes $K(z) + 2iK'(z)$. Thus in
terms of the basis $\{K(z), iK'(z)\}$ of solutions to (3.2.19a), the monodromy representation
becomes

$$\sigma_0 \mapsto \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \qquad \sigma_1 \mapsto \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}. \qquad (3.2.19b)$$

For the details of this calculation, see chapter 14.5 of [486]. The image of (3.2.19b)
is precisely the congruence subgroup $\Gamma(2)$, which indeed is isomorphic to $F_2$. Now,
Poincaré would have us invert the function $iK'(z)/K(z)$. That ratio turns out to always
be in $\mathbb{H}$, and so denote it $\tau(z)$. Expressing $z = k^2$ as a function of $\tau$, we obtain

$$z(\tau) = \frac{\theta_2(\tau)^4}{\theta_3(\tau)^4}. \qquad (3.2.19c)$$

Indeed, we know from (2.3.8) that (3.2.19c) is invariant under $\Gamma(2)$.
It is remarkable to recover in this way the group $\Gamma(2)$, its action on $\mathbb{H}$ and a
modular function for $\Gamma(2)$ (in fact, $\Gamma(2)$ is genus-0 and $\theta_2^4/\theta_3^4$ generates all of its modular
functions). There are many other examples of this kind, for example


31 1 25 2 ae
ENE (zZ-1)“w=0
wl ta tul + (


yields in this way the j-function. See [516] for more on the deep relation between
modular forms and hypergeometric functions. The relation between affine algebras,
the KZ equation and hypergeometric functions is explored in [541]. The
Riemann–Hilbert problem asks that all linear representations of mapping class groups arise as
monodromies; see the appendix of [259] for a history of this problem and chapter VIII
of [80] for the modern treatment and generalisation using D-modules.
Thus Poincaré, like conformal field theory over a century after him, finds it natural to
interpret modularity using differential equation monodromy!
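The $\Gamma(2)$-invariance of (3.2.19c) is easy to test numerically. The sketch below assumes the usual conventions $\theta_2(\tau) = \sum_n e^{\pi i\tau(n+1/2)^2}$ and $\theta_3(\tau) = \sum_n e^{\pi i\tau n^2}$, which may differ in normalisation from (2.3.8), and checks invariance under the two generators $\tau \mapsto \tau + 2$ and $\tau \mapsto \tau/(2\tau+1)$. The truncation length and sample point are arbitrary choices.

```python
import cmath

def theta2(tau, terms=40):
    """Truncated theta_2: sum over n of exp(pi i tau (n + 1/2)^2)."""
    return sum(cmath.exp(1j * cmath.pi * tau * (n + 0.5) ** 2)
               for n in range(-terms, terms))

def theta3(tau, terms=40):
    """Truncated theta_3: sum over n of exp(pi i tau n^2)."""
    return sum(cmath.exp(1j * cmath.pi * tau * n * n)
               for n in range(-terms, terms + 1))

def z_of_tau(tau):
    """The function z(tau) = theta_2^4 / theta_3^4 of (3.2.19c)."""
    return theta2(tau) ** 4 / theta3(tau) ** 4

tau = 0.1 + 0.8j
print(abs(z_of_tau(tau + 2) - z_of_tau(tau)))              # tiny
print(abs(z_of_tau(tau / (2 * tau + 1)) - z_of_tau(tau)))  # tiny
print(abs(z_of_tau(tau + 1) - z_of_tau(tau)))              # not small
```

The last line illustrates that the invariance group really is $\Gamma(2)$ and not all of $\mathrm{SL}_2(\mathbb{Z})$: the translation $\tau \mapsto \tau + 1$ changes the value.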


3.2.5 Singularities and Lie algebras 


In this subsection we quickly review the geometry underlying the associations of sin- 
gularities to simple Lie algebras (duVal) and affine Lie algebras (McKay), which are 
described in Section 2.5.2. This is related to mirror symmetry and provides a new expla- 
nation for the modularity of affine algebra characters. 

Let $\Gamma$ be a finite subgroup of $\mathrm{SU}_2(\mathbb{C})$. Then the orbifold $\mathbb{C}^2/\Gamma$ has a critical point at
the fixed point $(0, 0)$; the minimal resolution $X_\Gamma = \widetilde{\mathbb{C}^2/\Gamma}$ is a smooth noncompact real
4-manifold with an ALE (‘asymptotically locally Euclidean’) hyper-Kähler structure.
An ALE manifold is Riemannian, with a metric tending quickly to the Euclidean one as
$r \to \infty$. Physically, they correspond to positive-definite self-dual solutions to Einstein’s
gravitation equations in a vacuum (‘gravitational instantons’). Conversely, any ALE
hyper-Kähler manifold is diffeomorphic to some $X_\Gamma$ for a unique $\Gamma$. The details are
reviewed in [362].

Kronheimer–Nakajima [362] use the Atiyah–Singer Index Theorem to directly relate
the duVal and McKay data associated with a simple singularity. Let $X$ be an ALE
hyper-Kähler manifold and $\Gamma < \mathrm{SU}_2(\mathbb{C})$ the corresponding finite group. Then asymptotically
at infinity, $X$ is flat and in fact looks like $\mathbb{R}^4/\Gamma$. Given any vector bundle $E$ over $X$,
the fibre over $\infty$ defines a $\Gamma$-module $R$ via monodromy. Kronheimer–Nakajima take $E$
to be $\mathcal{R} \otimes \mathcal{R}^*$, where $\mathcal{R}$ is the tautological vector bundle, because its index vanishes.
Then the monodromy representation $R$ decomposes as $\sum_i \rho_i \otimes \rho_i^*$, where $\rho_i$ are the
irreducible representations of $\Gamma$. The Index Theorem provides an expression for the
numbers

$$\frac{1}{|\Gamma|} \sum_{\gamma\ne e} \frac{\mathrm{ch}_{\rho_i}(\gamma)\;\mathrm{ch}_{\rho_j^*}(\gamma)}{2 - \mathrm{ch}_\rho(\gamma)}$$

for $i, j = 0, 1, \ldots, n$, as an integral over $X$ involving the intersection matrix, where $\rho$
is the defining two-dimensional representation of $\Gamma < \mathrm{SL}_2(\mathbb{C})$. From this they quickly
establish the equivalence of duVal’s observation that the intersection matrix is the
negative of the $n \times n$ Cartan matrix, with McKay’s interpretation of the $(n+1) \times (n+1)$
Cartan matrix as coefficients of the product $\rho \otimes \rho_i$.

The first direct relation between simple singularities and the Lie algebras $A_n$, $D_n$, $E_n$
was established by Brieskorn [86]. Let $\overline{g}_\Gamma$ be the finite-dimensional simple Lie algebra
associated with $\Gamma$, and $G_\Gamma$ the corresponding Lie group. Let $\overline{W}$ be its (finite) Weyl
group, and choose any Cartan subalgebra $\overline{\mathfrak{h}}$. Then Brieskorn obtained the singularity
$\mathbb{C}^2/\Gamma$ and its resolution by studying the map $\overline{g}_\Gamma \to \overline{\mathfrak{h}}/\overline{W}$, sending $x = x_s + x_n \in \overline{g}_\Gamma$
(this decomposition of $x$ is just the Jordan canonical form [300]) to the orbit of the
semi-simple part $x_s$ under the adjoint action of $G_\Gamma$; these orbits are parametrised by
$\overline{\mathfrak{h}}/\overline{W}$ (Section 1.5.2).

More relevant for us is Nakajima’s geometric realisation of affine algebras and their
integrable representations (see e.g. his review [445]). Let $E$ be an anti-self-dual
Yang–Mills instanton over $X$ with gauge group $U_k(\mathbb{C})$. These bundles $E$ are associated with
three discrete invariants: the monodromy representation $R$ as above; the first Chern class
$c_1(E)$; and the instanton number $\mathrm{ch}_2(E) \in \mathbb{N}$.

The monodromy $R$ is a $k$-dimensional representation of $\Gamma$. Decompose $R$ into
irreducibles: $R = \oplus_i \lambda_i\rho_i$, where the multiplicities $\lambda_i \in \mathbb{N}$. Then taking dimensions we
obtain $k = \sum_i a_i\lambda_i$, where $a_i = \dim\rho_i$. According to the McKay correspondence, $a_i$
are the labels of the corresponding nontwisted affine algebra $g_\Gamma$, and so $\lambda = \sum_i \lambda_i\omega_i$ is
a level-$k$ integrable highest weight of $g_\Gamma$.

Nakajima proceeds to construct not only $g_\Gamma$ from the geometric data, but also the
$g_\Gamma$-module $L(\lambda)$. The singularity at $(0,0)$ of $\mathbb{C}^2/\Gamma$ resolves locally into $n$ copies of
the sphere $\mathbb{P}^1(\mathbb{C})$. These give a basis of $H_2(X, \mathbb{Z})$; Nakajima identifies them with the
usual basis $h_i$ of a Cartan subalgebra of the finite-dimensional algebra $\overline{g}_\Gamma$ and their
intersection form with the Killing form. Thus the dual vectors $c_1(E)$ are weights. The
number $\mathrm{ch}_2(E)$ is identified with an eigenvalue of the derivation $\ell_0 \approx L_0$. The other
generators $e_i$, $f_i$ of $g_\Gamma$ can be interpreted likewise. The moduli space $\mathcal{M}(k)$ of $U_k(\mathbb{C})$-instantons
on $X$ has a finite-dimensional connected component $\mathcal{M}(k)_{\lambda,\mu,n}$ for every
choice of monodromy $\lambda$, $c_1 = \mu$ and $\mathrm{ch}_2 = n$. The infinite-dimensional cohomology
space $H^*(\mathcal{M}(k))$ carries a natural though reducible module of the affine algebra $g_\Gamma$.
However, the middle-dimensional cohomology

$$\bigoplus_{\mu,n} H^d(\mathcal{M}(k)_{\lambda,\mu,n}), \qquad d = \tfrac{1}{2}\dim\mathcal{M}(k)_{\lambda,\mu,n} \qquad (3.2.20)$$

is isomorphic to $L(\lambda)$, with each summand being a weight-space (the middle-dimensional
cohomology spaces are generally the most interesting: for example, the pairing defines
a bilinear form, here the Killing form, on them).

This construction generalises considerably [445]. It also has a natural interpretation in
string theory. The Bogomol’nyi–Prasad–Sommerfeld (‘BPS’) states generally form an
algebra closely related to Borcherds–Kac–Moody algebras (Section 3.3.2) [276]. Inside
this BPS algebra for the heterotic string on the torus $T^4$ is the associated affine algebra.
This string theory is dual to that of a type IIA string on a K3 surface ($X_\Gamma$ is essentially
a noncompact K3), where Nakajima’s construction is very natural. So string theory
interprets Nakajima’s cohomological construction of affine algebras as a manifestation
of mirror symmetry [276]. In this context, Vafa–Witten suggested that the modularity of
affine algebra characters may have to do with S-duality [540], an $\mathrm{SL}_2(\mathbb{Z})$-symmetry of
the heterotic string. It seems unlikely though that this can account for the modularity in
arbitrary RCFT. We revisit mirror symmetry [291] in Section 7.3.8.

Physically, instantons are configurations for which the classical action (4.1.3) has a
local minimum. This means that in the corresponding quantum theory, we should
perturb about them just as we do about the vacua. See the review [159] on instantons in
supersymmetric theories. It turns out that (not necessarily holomorphic) modular forms
appear naturally in this context, with the modular group arising again through S-duality.
Recalling Section 2.4.3, we can ask: Can S-duality sometimes be extended naturally into
a $B_3$ symmetry? This may provide a universal simplification, for example, for fractional
instantons.


3.2.6 Loop groups 


This brief subsection introduces the Lie groups of the affine algebras, by translating the 
previous subsections into this geometric language. See the book [465] for more details. 
Loop groups appear directly in Wess—Zumino—Witten string theory, and in the study of 
certain differential equations (solitons), but otherwise the affine algebra is mathematically 
prior. From our (limited) perspective, the geometric insight gained isn’t obviously worth 
the analytic subtleties. 

Choose any compact Lie group G, and let g be its Lie algebra. By the loop group LG 
we mean all smooth maps S! — G, and by the loop algebra Lg we mean all smooth 
maps S! — g. The loop group LG has a group structure given by pointwise product, 
and in fact it forms an infinite-dimensional Lie group with Lie algebra £g. 

Think of G as a subgroup of U,,(C), as we can. The polynomial loop group L polyG is 
the set of all loops y € £G that can be written in the form 


(oe) 


y(z)= > Amz”, 


m=—CO 


i.e. as a matrix-valued function, where z € S! and each am is an n x n complex matrix, 
with all but finitely many am = 0. Note that £ po1yG is indeed a group — for example, 
inverse is given by y (z)! = Da alz ™ € LyotyG. However, note that Leas consists 
of the monomials az” for some constants m € Z anda € S! C C (to see this, multiply 
y(z) by y(z)'; the result is a Laurent polynomial in z with coefficients in C, which 
identically equals 1 for uncountably many z € C). Thus £po/yS! has Lie algebra iR 4 
Lpoly S ' For semi-simple G, however, £ polyG has Lie algebra £po1y9, as we'd like. 

The loop group £G is generally better behaved than £ pory G. For example, we know 
the exponential map exp: g — G is onto and locally one-to-one. The exponential map 
Lg — LG is defined in the obvious way (as the exponential of a matrix-valued function), 
and it is locally (but not globally) both one-to-one and onto. On the other hand, the 
exponential of a Laurent polynomial will usually not be a Laurent polynomial, and so 
the exponential map doesn’t exist for polynomial loops. By way of comparison, as we 
mentioned in Section 3.1.2, exp: Vect(S!) —> Diff(S') is neither locally one-to-one nor 
locally onto (in fact its image is nowhere dense). 

Diff(S!) acts naturally on LG, by changing the parametrisation of the loop (for simple 
G, the only other automorphisms of LG come from the loop group of Aut(G)). 

To enrich the representation theory of $LG$, we centrally extend $LG$ by $S^1$. For simple
$G$, $LG$ has an inequivalent central extension $\widetilde{LG}_n$ for each $n = 0, 1, 2, \ldots$, and these
exhaust all of them. $\widetilde{LG}_0 = S^1 \times LG$ is the trivial extension; $\widetilde{LG}_1$ is the unique
simply-connected such extension. $\widetilde{LG}_n$ is obtained from $\widetilde{LG}_1$ by quotienting by the order-$n$
subgroup of the centre $S^1$. The Lie algebra of any $\widetilde{LG}_n$, $n > 0$, is isomorphic to the
unique nontrivial central extension of the loop algebra $L\mathfrak{g}$.

We're interested in continuous projective representations of $LG$ by bounded operators
in a Hilbert space $H$. We want these as usual to be $\mathbb{Z}$-graded. But an $S^1$ action is the
same as a $\mathbb{Z}$-grading. More precisely, consider the group $S^1$ of rigid rotations $R_\theta$ acting on $LG$:
that is, a loop $\gamma(t) \in LG$ gets sent to the loop $(R_\theta\gamma)(t) = \gamma(t - \theta)$ for some fixed
$0 \le \theta < 2\pi$. We can decompose this $S^1$ action on $H$ Fourier-like into (the completion
of) a direct sum


$$\bigoplus_{\ell} H(\ell)$$


of subspaces $H(\ell)$ on which $R_\theta$ acts like $e^{-i\ell\theta}$. In other words, $e^{-i\theta L_0}$ represents $R_\theta$.


We require $H(\ell)$ to vanish for all $\ell$ sufficiently close to $-\infty$. Because of the conformal
field theory interpretation given next chapter, these eigenvalues $\ell$ are thought of as energy,
and these representations are called positive energy representations. Any such projective
representation of $LG$ lifts to one of the semi-direct product of this $S^1$ with any central
extension $\widetilde{LG}_n$. This double $S^1$-extension of $LG$ corresponds to the double $\mathbb{C}$-extension
of the (polynomial) loop algebra performed in Section 3.2.2.

Let $G$ be semi-simple. Any projective representation $H$ of $LG$ of positive energy
is unitary and hence is completely reducible into a discrete direct sum of irreducible
representations. The above action of $S^1$ (through the operators $e^{-i\theta L_0}$) extends to a
projective action of $\mathrm{Diff}^+(S^1)$. The $L_0$-eigenspaces $H(\ell)$ of any irreducible representation
$H$ are all finite-dimensional. We can refine these eigenspaces by choosing a maximal
torus $T$ of $G$ (it will be isomorphic to $S^1 \times \cdots \times S^1$ ($r$ times), where $r$ is the rank of
$G$). We can diagonalise this action of $S^1 \times T \times S^1$, where the first $S^1$ is from the rigid
rotations, and the second from the central extension; then


$$H(\ell) = \bigoplus_{\mu} H(\ell, \mu, k)$$


is the corresponding diagonalisation into weight spaces. Of course we are rediscovering
the weight-space decomposition in, for example, (3.2.9a). The ‘rigid rotation’ $S^1$
corresponds to the extension of the loop algebra $L_{\mathrm{poly}}\mathfrak{g}$ by the derivation $-\ell_0$, and the
projective $\mathrm{Diff}^+(S^1)$ action corresponds to the Virasoro action (3.2.15). The maximal
torus $S^1 \times T \times S^1$ of the double extension of $LG$ corresponds to the (real) Cartan
subalgebra $\mathfrak{h}$ of $\mathfrak{g}^{(1)}$. Given any irreducible projective representation of $LG$ of positive energy,
the derived projective representation of $L\mathfrak{g}$, restricted to $L_{\mathrm{poly}}\mathfrak{g}$, is an integrable
highest-weight representation $L(\lambda)$ of $\mathfrak{g}^{(1)}$. Conversely, any such representation of $\mathfrak{g}^{(1)}$
lifts (‘integrates’) to a projective representation of positive energy of $LG$.

Any irreducible projective representation of $LG$ lifts to a true representation of the
simply-connected $\widetilde{LG}_1$. It lifts to a true representation of $\widetilde{LG}_n$ iff $n$ divides the level $k$.

The analogue of Borel—Weil applies here much as in Section 1.5.5; the role of the 
symmetric space G/T is played here by the infinite Grassmannian LG /T (see chapter 11 
of [465]). The irreducible representations also fit in well with Kirillov’s orbit method 
[198].

It is tempting to hope more generally that the group $\mathrm{Map}(M, G)$, for any manifold
$M$ and compact $G$, should be a relatively accessible class of infinite-dimensional Lie
groups. However, the theory is much more difficult than $LG = \mathrm{Map}(S^1, G)$ and little is
known about their representations (see chapter 3 and section 9.1 in [465]).


Question 3.2.1. Define $A$ to be the space of all differential operators of the form
$\sum_{m,n} a_{m,n}\, x^m\, d^n/dx^n$, where all $a_{m,n} \in \mathbb{C}$ and all but finitely many $a_{m,n}$ equal 0. Define
a Lie algebra structure on $A$ in the obvious way. Prove that $A$ is a simple $\mathbb{Z}$-graded Lie
algebra of polynomial growth.


Question 3.2.2. For a manifold X and Lie algebra L, when is Map(X, L) a simple Lie 
algebra? 


Question 3.2.3. Show that the Witt algebra acts on the affine algebra g” as derivations. 


Question 3.2.4. Show that a highest-weight representation of a nontwisted affine Lie
algebra $\mathfrak{g} = X_r^{(1)}$ with highest weight $(\lambda, k, u)$ is isomorphic as a $\mathfrak{g}$-module to one with
highest weight $(\lambda, k, 0)$, when $k \ne 0$.


Question 3.2.5. Classify all invariant symmetric bilinear forms for $A_1^{(1)}$.


Question 3.2.6. Compute the cardinality $\|P_+^k\|$ for all series $A_r^{(1)}$, $B_r^{(1)}$, $C_r^{(1)}$, $D_r^{(1)}$.
(Hint: this can always be done using one or two binomial coefficients.) 


Question 3.2.7. The affine Weyl group of $A_1^{(1)}$ has two generators, which we call here
$w$ and $t$. These act on $\mathbb{Z}^2$ as follows:


$$w(a, b) = (-a,\, b + 2a), \qquad t(a, b) = (3a + 2b,\, -2a - b).$$


(a) Find a formula for the action of $t^n$ on $(a, b)$. Find the orders of $w$ and $t$, and the
determinants $\det(w)$ and $\det(t)$.
(b) Let $\lambda = (a, b) \in \mathbb{Z}^2$ obey $k := a + b > 0$. Write $\rho = (1, 1)$. Show that the affine
Weyl orbit of $\lambda + \rho$ intersects


$$P_{++}^{k+2} := \{(1, k+1), (2, k), \ldots, (k, 2), (k+1, 1)\}$$


in at most one point, and that the orbit fails to intersect $P_{++}^{k+2}$ iff $\lambda + \rho$ is fixed by some
nontrivial element of the affine Weyl group.


3.3 Generalisations of the affine algebras 


Affine algebras are fascinating because they draw together so many different areas of 
mathematics and physics. Like anything else, they embed into assorted families in plenty 
of ways, each embedding preserving some properties and losing others. But do they 
embed into a much larger family of algebras that are also of interest outside Lie theory? 

Generalisation is not the point of mathematics, and in fact, one must be honest, is 
usually rather dry. The challenge is to generalise in a rich and revealing direction. One 
of the more reliable ways of doing this is closure. Suppose we like to perform a certain 
activity, which unfortunately sometimes results in our toys being flung from our sandbox.
Then we build a bigger sandbox. When we divide integers, we don’t always get integers,
so we construct the rationals. When we take limits of rationals, we don’t always get 
rationals, so we construct the reals. When we take square-roots of reals, we don’t always 
get reals, so we construct the complex numbers. 

Another appealing strategy for generalisation — analogy — was followed by Moody at 
the birth of Kac—Moody algebras (Section 3.2.1). However this strategy, even in the hands 
of a master, will not always be successful. This section reviews various generalisations 
of affine algebras, all obtained through analogy. Most important for our story are the 
Borcherds—Kac—Moody algebras, which have played a key role for instance in the proof 
of the Monstrous Moonshine conjectures. 


3.3.1 Kac—Moody algebras 


Recall the presentation R1, R2 of simple Lie algebras given in Definition 1.4.5, defined in
terms of a Cartan matrix C1–C4. From the point of view of generators and relations, the
step from ‘finite-dimensional simple’ to ‘Kac–Moody’ is rather easy: the only difference
is that we drop the ‘positive-definite’ condition C4 (which was responsible for
finite-dimensionality). That is:


Definition 3.3.1 (a) A Cartan$_{KM}$ matrix $A$ is any $\ell \times \ell$ integral matrix $A$ obeying C1,
C2, C3 (see Definition 1.4.5(a)), together with


C4′. there exists a positive diagonal matrix $D$ such that the product $AD$ is symmetric (i.e.
$(AD)^t = AD$).

(b) Given any Cartan$_{KM}$ matrix $A$, the Kac–Moody algebra $\mathfrak{g} = \mathfrak{g}(A)$ is the Lie
algebra with generators $e_i, f_i, h_i$, subject as before to the relations R1 and R2 (see
Definition 1.4.5(b)).


What we call Kac—Moody algebras are usually called symmetrisable Kac—Moody alge- 
bras in the literature. The adjective ‘symmetrisable’ emphasises the requirement C4’, 
which we shall always assume; dropping it means losing the invariant bilinear form, 
among other things. What we call ‘Cartan$_{KM}$ matrix’ here is usually called ‘generalised
symmetrisable Cartan matrix’, but although that use of the word ‘generalised’ is tradi- 
tional, it is now inappropriate (see Definition 3.3.4 below). More generally, appending 
‘generalised’ to a term is an unimaginative empty cop-out that should be banned. 

The theory of Kac–Moody algebras is quite parallel to that of the finite-dimensional
simple Lie algebras. They are also generated by (finitely many) $A_1$ subalgebras. Most
entries of $A$ again are zero, so it is most convenient to graphically represent $A$ using
the Coxeter–Dynkin diagram (recall their definition in Section 1.4.3). As before, we
may without loss of generality take the Cartan$_{KM}$ matrices to be indecomposable (i.e.
consider connected diagrams).


Lemma 3.3.2 ([328], section 4.3) Let $A$ be an indecomposable Cartan$_{KM}$ matrix.
Then exactly one of the following possibilities holds:
(Fin) $\det(A) \ne 0$, and there exists a column vector $u > 0$ such that $Au > 0$;
(Aff) the nullspace (i.e. 0-eigenspace) of $A$ is one-dimensional, and there is a column vector
$u > 0$ such that $Au = 0$;
(Hyp) there is a column vector $u > 0$ such that $Au < 0$.


If the Cartan$_{KM}$ matrix $A$ is of finite type, then the corresponding Lie algebra $\mathfrak{g}(A)$
is finite-dimensional and simple. If the matrix $A$ is of affine type, then the algebra
$\mathfrak{g}(A)$ is infinite-dimensional, but has a $\mathbb{Z}$-grading $\mathfrak{g}(A) = \bigoplus_j \mathfrak{g}_j$ into finite-dimensional
subspaces $\mathfrak{g}_j$ whose dimensions $\dim(\mathfrak{g}_j)$ grow at most polynomially with $j$ (see
Section 3.2.1). The affine algebras come in two flavours, nontwisted and twisted,
and are listed in Figures 3.2 and 3.3. For $A$ of hyperbolic type, again $\mathfrak{g}(A)$ has a
$\mathbb{Z}$-grading into finite-dimensional subspaces $\mathfrak{g}_j$, but their dimensions $\dim(\mathfrak{g}_j)$ grow
exponentially with $j$. We are mostly interested in the nontwisted affine algebras
(Section 3.2). Relatively little is known about the hyperbolic ones (but see Section 3.4.3).
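
The trichotomy of Lemma 3.3.2 can be tested mechanically on small matrices. The sketch below does not search for the vector $u$; instead it assumes the standard principal-minor criterion from Kac's book (finite type iff all principal minors are positive, affine type iff all proper principal minors are positive and the determinant vanishes), which is not stated in the text above. The function names are hypothetical, and the label "Hyp" simply follows the naming of the third case in the lemma as printed here.

```python
from fractions import Fraction
from itertools import combinations

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n = len(rows)
    result = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if rows[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            result = -result
        result *= rows[col][col]
        for i in range(col + 1, n):
            factor = rows[i][col] / rows[col][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[col])]
    return result

def principal_minor(matrix, idx):
    return det([[matrix[i][j] for j in idx] for i in idx])

def classify(cartan):
    """Return 'Fin', 'Aff' or 'Hyp' for an indecomposable Cartan_KM matrix."""
    n = len(cartan)
    # Assumed criterion: positivity of all proper principal minors.
    proper_ok = all(
        principal_minor(cartan, idx) > 0
        for size in range(1, n)
        for idx in combinations(range(n), size)
    )
    d = det(cartan)
    if proper_ok and d > 0:
        return "Fin"
    if proper_ok and d == 0:
        return "Aff"
    return "Hyp"

print(classify([[2, -1], [-1, 2]]))   # the finite-dimensional algebra A_2
print(classify([[2, -2], [-2, 2]]))   # the affine algebra A_1^(1)
print(classify([[2, -3], [-3, 2]]))   # a rank-2 matrix of the third type
```

The enumeration over index subsets is exponential in the size of the matrix, which is harmless for the small diagrams of interest but would need replacing for large ones.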

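The relation between the realisation in Section 3.2.2 of an affine algebra as a loop
algebra and the presentation of Definition 3.3.1(b) is as follows. Consider for simplicity
$A_1^{(1)}$. The relevant Cartan$_{KM}$ matrix is $A = \begin{pmatrix} 2 & -2 \\ -2 & 2 \end{pmatrix}$; then
$\mathfrak{g}(A) \cong L_{\mathrm{poly}}(A_1) \oplus \mathbb{C}C$, with the isomorphism identifying


$$e_1 \mapsto \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad f_1 \mapsto \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad h_1 \mapsto \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$

$$e_0 \mapsto \begin{pmatrix} 0 & 0 \\ t & 0 \end{pmatrix}, \qquad f_0 \mapsto \begin{pmatrix} 0 & t^{-1} \\ 0 & 0 \end{pmatrix}, \qquad h_0 \mapsto C - \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$


More generally, the central term $C$ of the affine algebra is given by $C = \sum_i a_i^\vee h_i$. Note
though that we are missing the derivation $\ell_0$; we will return to that shortly.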
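
The defining relations can be verified directly in the loop realisation of $A_1^{(1)}$. The sketch below assumes the identification $e_0 \mapsto f \otimes t$, $f_0 \mapsto e \otimes t^{-1}$, $h_0 \mapsto C - h$, and assumes the bracket $[x t^m, y t^n] = [x, y]\, t^{m+n} + m\, \delta_{m+n,0}\, \mathrm{tr}(xy)\, C$ for the centrally extended loop algebra. Both the normalisation of the central term and the data layout (a pair consisting of a dictionary of matrix coefficients and a central coefficient) are assumptions made for illustration.

```python
ZERO = ((0, 0), (0, 0))
E = ((0, 1), (0, 0))
F = ((0, 0), (1, 0))
H = ((1, 0), (0, -1))
MINUS_H = ((-1, 0), (0, 1))

def mat_mul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def bracket(u, v):
    """Bracket in L_poly(sl_2) + C*C; an element is (dict m -> coefficient of t^m, central part)."""
    loops, central = {}, 0
    for m, x in u[0].items():
        for n, y in v[0].items():
            xy, yx = mat_mul(x, y), mat_mul(y, x)
            prev = loops.get(m + n, ZERO)
            loops[m + n] = tuple(
                tuple(prev[i][j] + xy[i][j] - yx[i][j] for j in range(2))
                for i in range(2)
            )
            if m + n == 0:
                central += m * (xy[0][0] + xy[1][1])   # assumed cocycle m * tr(xy)
    return ({k: a for k, a in loops.items() if a != ZERO}, central)

def scale(u, c):
    """Multiply an element by a nonzero integer c."""
    return (
        {m: tuple(tuple(c * entry for entry in row) for row in x) for m, x in u[0].items()},
        c * u[1],
    )

# Chevalley generators under the assumed identification.
e1, f1, h1 = ({0: E}, 0), ({0: F}, 0), ({0: H}, 0)
e0, f0, h0 = ({1: F}, 0), ({-1: E}, 0), ({0: MINUS_H}, 1)
NOTHING = ({}, 0)

print(bracket(e0, f0) == h0)                                  # [e_0, f_0] = h_0 = C - h_1
print(bracket(h1, e0) == scale(e0, -2))                       # off-diagonal Cartan entry -2
print(bracket(e1, bracket(e1, bracket(e1, e0))) == NOTHING)   # Serre relation (ad e_1)^3 e_0 = 0
```

With these conventions the central element arises only from $[e_0, f_0]$, which is why $h_0$ differs from $-h_1$ by $C$.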

For indecomposable $A$, $\mathfrak{g}(A)$ is simple iff the determinant $\det(A) \ne 0$. When $\det(A) = 0$,
$\mathfrak{g}(A)$ has a centre of dimension $\ell - m$ where $m$ is the rank of the matrix $A$.
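
The dimension count $\ell - m$ is easy to evaluate for concrete matrices. The following sketch computes the rank by exact row reduction; the helper names are hypothetical, and the example matrices are the usual Cartan matrices of $A_2$, $A_1^{(1)}$ and $A_2^{(1)}$.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of an integer matrix, by row reduction over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def centre_dimension(cartan):
    """l - m, where l is the size of the matrix and m its rank."""
    return len(cartan) - rank(cartan)

print(centre_dimension([[2, -1], [-1, 2]]))                      # A_2
print(centre_dimension([[2, -2], [-2, 2]]))                      # A_1^(1)
print(centre_dimension([[2, -1, -1], [-1, 2, -1], [-1, -1, 2]]))  # A_2^(1)
```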

The basic structure theorem for Kac—Moody algebras is: 


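Theorem 3.3.3 Let $\mathfrak{g} = \mathfrak{g}(A)$ be a symmetrisable Kac–Moody algebra (over $\mathbb{R}$). Then:
(a) $\mathfrak{g}$ has triangular decomposition $\mathfrak{g} = \mathfrak{g}_+ \oplus \mathfrak{h} \oplus \mathfrak{g}_-$ where $\mathfrak{g}_+$ is the subalgebra
generated by the $e_i$, $\mathfrak{g}_-$ is generated by the $f_i$ and $\mathfrak{h} = \mathrm{span}\{h_i\}$ is the Cartan subalgebra;
(b) $\mathfrak{g}$ has a root space decomposition: formally calling $e_i$ degree $\alpha_i$ and $f_i$ degree $-\alpha_i$,
and defining $\mathfrak{g}_\alpha$ to be the subspace of degree $\alpha \in \mathbb{Z}\alpha_1 + \mathbb{Z}\alpha_2 + \cdots$, we get $\mathfrak{h} = \mathfrak{g}_0$
and $\mathfrak{g}_\pm = \bigoplus_{\alpha \in \Delta_\pm} \mathfrak{g}_\alpha$, where $[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] \subseteq \mathfrak{g}_{\alpha+\beta}$ and $\Delta_- = -\Delta_+$;
(c) there is an involution $\omega$ on $\mathfrak{g}$ for which $\omega e_i = f_i$, $\omega h_i = -h_i$ and $\omega \mathfrak{g}_\alpha = \mathfrak{g}_{-\alpha}$;
(d) $\dim \mathfrak{g}_\alpha < \infty$ and $\dim \mathfrak{g}_{\pm\alpha_i} = 1$;
(e) there is an invariant symmetric bilinear form $(\cdot|\cdot)$, that is $([ab]|c) = -(b|[ac])$, such
that for each root $\alpha \ne 0$ the restriction of $(\cdot|\cdot)$ to $\mathfrak{g}_\alpha \times \mathfrak{g}_{-\alpha}$ is nondegenerate and
$(\mathfrak{g}_\alpha|\mathfrak{g}_\beta) = 0$ whenever $\beta \ne -\alpha$;
(f) there is a linear assignment $\alpha \mapsto h_\alpha \in \mathfrak{h}$ such that for all $a \in \mathfrak{g}_\alpha$, $b \in \mathfrak{g}_{-\alpha}$ we have
$[a, b] = (a|b)\, h_\alpha$.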


These $\alpha$ are called roots and the $\alpha_i$ simple roots, as before. The roots $\alpha$ can be regarded
as linear functionals on $\mathfrak{h}$, in such a way that for any $x \in \mathfrak{g}_\alpha$ and $h \in \mathfrak{h}$, we have $[hx] = \alpha(h)\, x$.
The involution in (c) is the Cartan involution, and is needed in defining unitary
representations. The bilinear form in (e) is the generalisation here of the Killing form.
For simple roots $\alpha_i$, $h_{\alpha_i}$ in (f) is $h_i$, and is sometimes denoted $\alpha_i^\vee$ and called a co-root.
The field was taken to be $\mathbb{R}$ here for convenience (Question 3.3.1).

When $\det(A) = 0$, the bilinear form restricted to $\mathfrak{h}$ will be degenerate and the simple
roots interpreted as linear functionals on $\mathfrak{h}$ will be linearly dependent. To get around
this, extend the Cartan subalgebra by $\dim(\mathrm{Null}(A)) = \ell - m$ more vectors. Call $\mathfrak{h}^e$ the
resulting $(2\ell - m)$-dimensional space. Extend the bilinear form to $\mathfrak{h}^e$ so that it becomes
nondegenerate, and the domain of the simple roots $\alpha_i \in \mathfrak{h}^*$ to all of $\mathfrak{h}^e$ so they become
linearly independent. Up to equivalence, there is a unique way to do this. The space
$\mathfrak{g}(A)^e := \mathfrak{g}(A) + \mathfrak{h}^e$ is given a Lie algebra structure by extending the relations of
Definition 3.3.1(b) to include


$[h h'] = 0$, $\forall h, h' \in \mathfrak{h}^e$, (3.3.1a)
$[h e_i] = \alpha_i(h)\, e_i$, $\forall h \in \mathfrak{h}^e$, (3.3.1b)
$[h f_i] = -\alpha_i(h)\, f_i$, $\forall h \in \mathfrak{h}^e$. (3.3.1c)


For a Cartan$_{KM}$ matrix $A$ of affine type, $\mathfrak{g}(A)^e$ is isomorphic to the corresponding algebra
$\mathfrak{g}^{(1)}$ we defined in Section 3.2.2: the extra vector is the derivation $\ell_0$. Whenever
$\det(A) = 0$, $\mathfrak{g}(A)^e$ and not $\mathfrak{g}(A)$ is the correct algebra to consider. Write $\mathfrak{g}(A)^e := \mathfrak{g}(A)$
when $\det(A) \ne 0$. Theorem 3.3.3 holds for $\mathfrak{g}(A)^e$, provided $\mathfrak{h}$ there is replaced with $\mathfrak{h}^e$.

Unlike the finite-dimensional case, some root multiplicities $\mathrm{mult}(\alpha) := \dim \mathfrak{g}_\alpha$ may
be > 1. The roots of $\mathfrak{g}(A)^e$ come in two flavours: real (with $(\alpha|\alpha) > 0$) and imaginary
(with $(\alpha|\alpha) \le 0$). The simple roots are all real. Real roots behave exactly like the roots
of finite-dimensional $\mathfrak{g}$: for example, $\mathrm{mult}(\alpha) = 1$ and the only multiples of $\alpha$ that are
also roots are $\pm\alpha$. Imaginary roots behave more like the nonroot $0 \in \mathfrak{h}^*$: for example,
$\mathrm{mult}(\alpha) \ge 1$ and any multiple $\mathbb{Z}\alpha$ is also a root.

The Weyl group $W$ here is generated by the reflections through the simple roots $\alpha_i$, or
equivalently by reflections through all real roots. It has the usual properties: for example,
root multiplicities are constant within the $W$-orbits.
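
Since the real roots are exactly the Weyl-group images of the simple roots, they can be enumerated by repeatedly applying simple reflections. The sketch below works in root-lattice coordinates and assumes the convention $r_i(\alpha_j) = \alpha_j - a_{ij}\alpha_i$; for the symmetric matrix of $A_1^{(1)}$ used in the example the index order is immaterial. The function names and the height cut-off are illustrative choices, and the norm is computed from the Cartan matrix itself, which is only valid when that matrix is symmetric.

```python
def reflect(cartan, i, coeffs):
    """Apply the simple reflection r_i to the root sum_j coeffs[j] * alpha_j."""
    pairing = sum(c * cartan[i][j] for j, c in enumerate(coeffs))
    out = list(coeffs)
    out[i] -= pairing
    return tuple(out)

def positive_real_roots(cartan, max_height):
    """All positive real roots of height <= max_height, as coefficient tuples."""
    n = len(cartan)
    simple = [tuple(int(i == j) for j in range(n)) for i in range(n)]
    seen = set(simple)
    frontier = list(simple)
    while frontier:
        root = frontier.pop()
        for i in range(n):
            image = reflect(cartan, i, root)
            if (all(c >= 0 for c in image)
                    and sum(image) <= max_height
                    and image not in seen):
                seen.add(image)
                frontier.append(image)
    return sorted(seen, key=lambda r: (sum(r), r))

def norm(cartan, coeffs):
    """(alpha|alpha) for a symmetric Cartan matrix, with simple roots of norm 2."""
    n = len(cartan)
    return sum(coeffs[i] * cartan[i][j] * coeffs[j] for i in range(n) for j in range(n))

A1_affine = [[2, -2], [-2, 2]]
roots = positive_real_roots(A1_affine, max_height=5)
print(roots)
print([norm(A1_affine, r) for r in roots])
print(norm(A1_affine, (1, 1)))   # alpha_0 + alpha_1 is not in the list above
```

In this example every root found has positive norm, while the vector $\alpha_0 + \alpha_1$, which is never reached by reflections, has norm zero, in line with the real and imaginary dichotomy described above.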

A Kac–Moody algebra $\mathfrak{g}(A)^e$ has all the familiar representation-theoretic definitions
and properties. For any weight $\lambda \in \mathfrak{h}^{e*}$, the Verma module $M(\lambda)$ and the irreducible
highest-weight module $L(\lambda)$ are defined as usual. In particular, highest-weight modules
are spanned by vectors of the form


$$f_{i_1} f_{i_2} \cdots f_{i_m} v, \qquad (3.3.2)$$


where $v$ is the highest-weight vector. Weight-space decompositions hold as before, and
characters $\mathrm{ch}_{L(\lambda)}$ are defined as in (1.5.9a). The character of the Verma module $M(\lambda)$
again equals (3.2.6). Integrability is defined by the locally nilpotent condition
(Section 3.2.3); again, $L(\lambda)$ is integrable iff all Dynkin labels $\lambda(h_i) \in \mathbb{N}$, iff $L(\lambda)$ is
unitarisable. The character of an integrable $L(\lambda)$ is given by the Weyl–Kac character formula


$$\mathrm{ch}_{L(\lambda)} = \frac{\sum_{w \in W} \det(w)\, e^{w(\lambda+\rho)}}{e^{\rho} \prod_{\alpha \in \Delta_+} (1 - e^{-\alpha})^{\mathrm{mult}\,\alpha}}. \qquad (3.3.3)$$


This is identical to the Weyl character formula (1.5.11), except that the sum and product
are infinite, and the multiplicities of (imaginary) root spaces can be > 1. For affine
algebras, it reduces to (3.2.11c).

Apart from the affine and finite-dimensional simple algebras, the other Kac-Moody 
algebras have yet to make a real impact on other areas of mathematics and mathematical 
physics. However, [127] and [171] anticipate that the hyperbolic Kac—Moody algebras 
$E_{10}$ and $E_{11}$ will appear in M-theory, the still-hypothetical physics underlying strings.


3.3.2 Borcherds’ algebras 


In his efforts to prove the Monstrous Moonshine conjectures, Borcherds further gener- 
alised affine algebras. It is easy to associate a Lie algebra to a matrix A, but which class 
of matrices will yield a deep theory? Borcherds found such a class by holding in his hand 
a single algebra — the fake Monster Lie algebra (Section 7.2.2) — which acted much like 
a Kac—Moody algebra, even though it had imaginary simple roots. 


Definition 3.3.4 (a) A Cartan$_{BKM}$ matrix $A$ is a (possibly infinite) matrix $A = (a_{ij})$,
$a_{ij} \in \mathbb{R}$, obeying


GC1. either $a_{ii} = 2$ or $a_{ii} \le 0$;
GC2. $a_{ij} \le 0$ for $i \ne j$, and $a_{ij} \in \mathbb{Z}$ when $a_{ii} = 2$; and
GC3. there is a diagonal matrix $D$ with each $d_{ii} > 0$ such that $DA$ is symmetric.


(b) The universal Borcherds–Kac–Moody algebra $\hat{\mathfrak{g}} = \hat{\mathfrak{g}}(A)$ is the Lie algebra with
generators $e_i, f_i, h_{ij}$, subject to the relations [71]:


GR1. $[e_i f_j] = h_{ij}$, $[h_{ij} e_k] = \delta_{ij} a_{ik} e_k$ and $[h_{ij} f_k] = -\delta_{ij} a_{ik} f_k$, for all $i, j, k$;
GR2. $(\mathrm{ad}\, e_i)^{1-a_{ij}} e_j = (\mathrm{ad}\, f_i)^{1-a_{ij}} f_j = 0$, whenever both $a_{ii} = 2$ and $i \ne j$; and
GR3. $[e_i e_j] = [f_i f_j] = 0$ whenever $a_{ij} = 0$.


As before, the adjective ‘symmetrisable’ is usually appended in the literature.
Unfortunately, the name ‘Borcherds’ is often replaced with the abomination ‘generalised’.
Note that for each $i$, $\mathrm{span}\{e_i, f_i, h_{ii}\}$ is isomorphic to $\mathfrak{sl}_2(\mathbb{C})$ when $a_{ii} \ne 0$ and to
$\mathfrak{Heis}$ (recall (1.4.3)) when $a_{ii} = 0$. Immediate consequences of the definition are that:
(i) $[h_{ij} h_{mn}] = 0$; (ii) $h_{ij} = 0$ unless the $i$th and $j$th columns of $A$ are identical; (iii) the
$h_{ij}$ for $i \ne j$ lie in the centre of $\hat{\mathfrak{g}}$. Setting all $h_{ij} = 0$ for $i \ne j$ gives the definition
of the Borcherds–Kac–Moody algebra $\mathfrak{g} = \mathfrak{g}(A)$ [69]. This central extension $\hat{\mathfrak{g}}$ of $\mathfrak{g}$ is
introduced for its role in Theorem 3.3.6 below. If $A$ has no zero columns, then $\hat{\mathfrak{g}}$ equals
its own universal central extension [71]. Because a Borcherds–Kac–Moody algebra can
satisfy fewer relations, it typically contains a large free Lie subalgebra [323] (a free Lie
algebra is analogous to a free group).

A universal Borcherds–Kac–Moody algebra differs from a Kac–Moody algebra in that
it is built up from Heisenberg algebras as well as $A_1$, and these subalgebras intertwine in
more complicated ways. Nevertheless, much of the theory for finite-dimensional simple
Lie algebras continues to find an analogue in this much more general setting (e.g.
root-space decomposition, Weyl group, character formula, ...). This unexpected feature
is the point of Borcherds–Kac–Moody algebras.

To get a feel for these algebras, let us prove a few simple results concerning the $h_{ij}$. Note
first that, using the above relations together with anti-associativity, we obtain
$[h_{ij} h_{k\ell}] = \delta_{ij}(a_{jk} - a_{j\ell})\, h_{k\ell}$. Comparing this with $[h_{k\ell} h_{ij}] = -[h_{ij} h_{k\ell}]$, we see that bracket must
always equal 0. Hence all $h$'s pairwise commute, and $h_{ij} = 0$ unless the $i$th and $j$th
columns of $A$ are identical.
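
The bracket computation in this paragraph can be spelled out step by step. The following is a sketch that uses only relation GR1 of Definition 3.3.4(b) and the Jacobi identity (the text's anti-associativity); the index $\ell$ is taken to be a matrix index here, not the rank.

```latex
\begin{align*}
[h_{ij}, h_{k\ell}] &= [h_{ij}, [e_k, f_\ell]] \\
  &= [[h_{ij}, e_k], f_\ell] + [e_k, [h_{ij}, f_\ell]] \\
  &= \delta_{ij}\, a_{ik}\, [e_k, f_\ell] - \delta_{ij}\, a_{i\ell}\, [e_k, f_\ell] \\
  &= \delta_{ij}\, (a_{jk} - a_{j\ell})\, h_{k\ell}.
\end{align*}
% If i \ne j the right-hand side vanishes. If i = j and k \ne \ell, antisymmetry gives
% [h_{ii}, h_{k\ell}] = -[h_{k\ell}, h_{ii}] = -\delta_{k\ell}(a_{\ell i} - a_{\ell i}) h_{ii} = 0.
% If i = j and k = \ell, the coefficient a_{ik} - a_{ik} is zero.
% Hence (a_{ik} - a_{i\ell}) h_{k\ell} = 0 for every i, so h_{k\ell} \ne 0 forces
% columns k and \ell of A to coincide.
```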

The basic structure theorem is that of Kac-Moody algebras (Theorem 3.3.3): 


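Theorem 3.3.5 [69] Let $\mathfrak{g} = \mathfrak{g}(A)$ be a Borcherds–Kac–Moody algebra (over $\mathbb{R}$).

Then:

(a) $\mathfrak{g}$ has triangular decomposition $\mathfrak{g} = \mathfrak{g}_+ \oplus \mathfrak{h} \oplus \mathfrak{g}_-$ where $\mathfrak{g}_+$ is the subalgebra
generated by the $e_i$, $\mathfrak{g}_-$ is generated by the $f_i$ and $\mathfrak{h} = \mathrm{span}\{h_i\}$ is the Cartan
subalgebra;

(b) $\mathfrak{g}$ has a root space decomposition: formally calling $e_i$ degree $\alpha_i$ and $f_i$ degree $-\alpha_i$,
and defining $\mathfrak{g}_\alpha$ to be the subspace of degree $\alpha \in \mathbb{Z}\alpha_1 + \mathbb{Z}\alpha_2 + \cdots$, we get $\mathfrak{h} = \mathfrak{g}_0$
and $\mathfrak{g}_\pm = \bigoplus_{\alpha \in \Delta_\pm} \mathfrak{g}_\alpha$, where $[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] \subseteq \mathfrak{g}_{\alpha+\beta}$ and $\Delta_- = -\Delta_+$;

(c) there is an involution $\omega$ on $\mathfrak{g}$ for which $\omega e_i = f_i$, $\omega h_i = -h_i$ and $\omega \mathfrak{g}_\alpha = \mathfrak{g}_{-\alpha}$;

(d) $\dim \mathfrak{g}_\alpha < \infty$ and $\dim \mathfrak{g}_{\pm\alpha_i} = 1$;

(e) there is an invariant symmetric bilinear form $(\cdot|\cdot)$ such that for each root $\alpha \ne 0$, the
restriction of $(\cdot|\cdot)$ to $\mathfrak{g}_\alpha \times \mathfrak{g}_{-\alpha}$ is nondegenerate and $(\mathfrak{g}_\alpha|\mathfrak{g}_\beta) = 0$ whenever $\beta \ne -\alpha$;

(f) there is a linear assignment $\alpha \mapsto h_\alpha \in \mathfrak{h}$ such that for all $a \in \mathfrak{g}_\alpha$, $b \in \mathfrak{g}_{-\alpha}$, we have
$[a, b] = (a|b)\, h_\alpha$.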


The condition that $\mathfrak{g}$ be symmetrisable (i.e. condition GC3) is necessary for the existence
of the bilinear form in Theorem 3.3.5(e). As in Section 3.3.1, it is common to add
derivations. In particular, define $D_i(a) = n_i a$ for any $a \in \mathfrak{g}_{n_1\alpha_1 + n_2\alpha_2 + \cdots}$; then each linear map
$D_i$ is a derivation, and adjoining these to $\mathfrak{h}$ defines an abelian algebra $\mathfrak{h}^e$. The simple
root $\alpha_j$ can be interpreted as the element of $\mathfrak{h}^{e*}$ obeying $\alpha_j(h_i) = a_{ij}$ and $\alpha_j(D_i) = \delta_{ij}$.
The role of the derivations is to make these simple roots linearly independent. Construct
the induced bilinear form $(\cdot|\cdot)$ on $\mathfrak{h}^{e*}$, obeying $(\alpha_i | \alpha_j) = d_j a_{ji}$ (see [322] for details).

The properties in Theorem 3.3.5 characterise Borcherds—Kac—Moody algebras (see 
e.g. [72] for a proof): 


Theorem 3.3.6 Let $L$ be a Lie algebra (over $\mathbb{R}$) satisfying the following conditions:
(i) $L$ has a $\mathbb{Z}$-grading $\bigoplus_i L_i$, and $\dim L_i < \infty$ for all $i \ne 0$;

(ii) $L$ has an involution $\omega$ sending $L_i$ to $L_{-i}$ and acting as $-1$ on $L_0$;

(iii) $L$ has a contravariant bilinear form $(\cdot|\cdot)$ such that $(L_i|L_j) = 0$ if $i \ne -j$, and
such that $-(a|\omega(a)) > 0$ if $0 \ne a \in L_i$ for $i \ne 0$.

Then there is a homomorphism $\pi$ from some $\hat{\mathfrak{g}}(A)$ to $L$ whose kernel is contained in the
centre of $\hat{\mathfrak{g}}(A)$, and $L$ is the semi-direct product of the image of $\pi$ with a subalgebra of
the abelian subalgebra $L_0$. That is, $L$ is obtained from $\hat{\mathfrak{g}}$ by modding out some of the
centre and adding some commuting derivations.

Conversely, any (real) Borcherds–Kac–Moody algebra obeys conditions (i), (ii) and (iii).
For example, let $L = \mathfrak{sl}_2(\mathbb{R})$ and recall (1.4.2b). Then $L$ has $\mathbb{Z}$-grading $L_{-1} \oplus L_0 \oplus L_1 =
\mathbb{R}e \oplus \mathbb{R}h \oplus \mathbb{R}f$, $\omega(x) = -x^t$ and $(x|y) = \mathrm{tr}(xy)$. Theorem 3.3.6 tells us that
Borcherds–Kac–Moody algebras are the ultimate generalisation of simple Lie algebras, in the sense
that any further generalisation will lose some basic structural ingredient.
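
The three conditions of Theorem 3.3.6 can be checked by hand for this $\mathfrak{sl}_2(\mathbb{R})$ example, or mechanically as in the sketch below. It uses the standard $2 \times 2$ matrices for $e$, $h$, $f$ together with $\omega(x) = -x^t$ and $(x|y) = \mathrm{tr}(xy)$ as stated in the text; the helper names are hypothetical.

```python
def mat_mul(x, y):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def bracket(x, y):
    xy, yx = mat_mul(x, y), mat_mul(y, x)
    return tuple(tuple(xy[i][j] - yx[i][j] for j in range(2)) for i in range(2))

def omega(x):
    """The involution x -> -x^t."""
    return tuple(tuple(-x[j][i] for j in range(2)) for i in range(2))

def form(x, y):
    """The bilinear form (x|y) = tr(xy)."""
    xy = mat_mul(x, y)
    return xy[0][0] + xy[1][1]

e = ((0, 1), (0, 0))
h = ((1, 0), (0, -1))
f = ((0, 0), (1, 0))
minus_h = ((-1, 0), (0, 1))

# (ii): omega is an involutive automorphism acting as -1 on the degree-0 part spanned by h.
print(omega(omega(e)) == e, omega(h) == minus_h)
print(omega(bracket(e, f)) == bracket(omega(e), omega(f)))

# (iii): graded pieces pair to zero unless their degrees are opposite,
# and -(a|omega(a)) is positive on the pieces of nonzero degree.
print(form(e, e), form(e, h), form(f, h), form(f, f))
print(-form(e, omega(e)), -form(f, omega(f)))
```

For a general matrix $a$ one has $-(a|\omega(a)) = \mathrm{tr}(a a^t)$, the sum of the squares of the entries, which is why positivity holds here.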

Let $\Pi^{re}$ be the set of all real simple roots, i.e. all $\alpha_i$ with $a_{ii} = 2$; the remainder are the
imaginary simple roots $\alpha_i \in \Pi^{im}$. The Weyl group $W$ of $\mathfrak{g}^e$ is generated by the reflections
$r_i : \mathfrak{h}^{e*} \to \mathfrak{h}^{e*}$ for each $\alpha_i \in \Pi^{re}$: $r_i(\lambda) = \lambda - \lambda(h_i)\, \alpha_i$. It is a crystallographic Coxeter
group (Section 3.2.1). The real roots of $\mathfrak{g}^e$ are defined to be those in $W(\Pi^{re})$; all other
roots are called imaginary. For all real roots $\alpha$, $\dim (\mathfrak{g}^e)_\alpha = 1$ and $(\alpha|\alpha) > 0$.

Integrable highest-weight modules are defined as before: namely, each $e_\alpha$, $f_\alpha$ must
act locally nilpotently for all real roots $\alpha$. More precisely, $V = \bigoplus_{\mu \in \mathfrak{h}^{e*}} V_\mu$ where the
weight-space $V_\mu := \{v \in V \mid h.v = \mu(h)\, v\}$, with $\dim V_\mu < \infty$, and whenever $a_{ii} = 2$,
$(e_i)^k.v = 0 = (f_i)^k.v$ for all $v \in V$ and all sufficiently large $k$. By the character we
mean the formal sum $\mathrm{ch}_V := \sum_{\mu \in \mathfrak{h}^{e*}} (\dim V_\mu)\, e^\mu$. Let $P_+$ be the set of all weights $\lambda \in \mathfrak{h}^{e*}$
obeying $\lambda(h_i) \in \mathbb{N}$ whenever $a_{ii} = 2$, and $\lambda(h_i) \ge 0$ for all other $i$. Define the
highest-weight $\mathfrak{g}^e$-module $L(\lambda)$ in the usual way as the quotient of the Verma module by the
largest proper graded submodule. Choose $\rho \in \mathfrak{h}^{e*}$ to satisfy $(\rho | \alpha_i) = \frac{1}{2} (\alpha_i | \alpha_i)$ for all
$i$, and define $S_\lambda = e^{\lambda+\rho} \sum_s \epsilon(s)\, e^{-s}$ where $s$ runs over all sums of imaginary simple roots
and $\epsilon(s) = (-1)^m$ if $s$ is the sum of $m$ distinct mutually orthogonal imaginary simple
roots, each of which is orthogonal to $\lambda$, otherwise $\epsilon(s) = 0$. Then we get the
Weyl–Kac–Borcherds character formula:


$$\mathrm{ch}_{L(\lambda)} = \frac{\sum_{w \in W} \epsilon(w)\, w(S_\lambda)}{e^{\rho} \prod_{\alpha \in \Delta_+} (1 - e^{-\alpha})^{\mathrm{mult}\,\alpha}} \qquad (3.3.4)$$


(compare (3.3.3)). $S_\lambda$ is the correction factor due to imaginary simple roots.

Thus Borcherds’ algebras strongly resemble Kac—Moody ones and constitute a natural 
and nontrivial generalisation. The main differences are that they can be generated by 
copies of the Heisenberg algebra as well as $\mathfrak{sl}_2(\mathbb{R})$, and that there can be imaginary
simple roots. For more on their theory, see, for example, [328] chapter 11.13, [272], 
[322], [469]. Interesting examples are the Monster Lie algebra (Section 7.2.2), whose 
(twisted) denominator identity supplied the relations needed to complete the proof of 
the Monstrous Moonshine conjectures, and the fake Monster [70]. A Borcherds—Kac— 
Moody algebra can be associated with any even Lorentzian lattice, and also with any 
Calabi-Yau manifold [275]. Of course it is a broad enough class that almost all of 
them will be uninteresting; an intriguing approach to identifying the interesting ones is 
sketched at the end of Section 3.4.3. 

We know simple Lie algebras arise in both classical and quantum physics, and the 
affine Kac—Moody algebras are important in conformal field theory, as we see next 
chapter. Borcherds—Kac—Moody algebras have appeared in the physics literature in the 
context of BPS states in string theory (see [275]), and as a possible symmetry of M-theory
[285].


3.3.3 Toroidal algebras


As mentioned in Section 3.2.6, replacing the loop algebra $S^1 \to \mathfrak{g}$ with more general
spaces $M \to \mathfrak{g}$ has a very different theory and seems much more complicated. The
most obvious generalisation of affine algebras, which has a chance of retaining some
of their special properties, is to replace the loop algebra $S^1 \to \mathfrak{g}$ with a space of maps
$S^1 \times \cdots \times S^1 \to \mathfrak{g}$. As $S^1 \times \cdots \times S^1$ ($n$ times) is topologically the $n$-dimensional torus,
these are called toroidal algebras. We will try to mimic the theory of loop algebras as
far as we can. If nothing else, we will identify some features responsible for making the
earlier theory so special.

Let $\mathfrak{g}$ be a simple finite-dimensional Lie algebra. Choose any $n \ge 1$, and let $\tilde{\mathfrak{g}}$ be
the multi-loop algebra, i.e. tensor product $\mathfrak{g} \otimes \mathbb{C}[t_0^{\pm 1}, \ldots, t_n^{\pm 1}]$ of $\mathfrak{g}$ with Laurent
polynomials in formal variables $t_i$. Then $\tilde{\mathfrak{g}}$ is a Lie algebra with $\mathbb{Z}^{n+1}$-grading into
finite-dimensional subspaces. The following theory treats as distinguished one of these $n + 1$
variables, namely $t_0$. To complete the construction of the toroidal algebra, we take the
universal central extension $0 \to \mathcal{K} \to \tilde{\mathfrak{g}} \oplus \mathcal{K} \to \tilde{\mathfrak{g}} \to 0$ of the multi-loop algebra $\tilde{\mathfrak{g}}$,
and then adjoin sufficiently many derivations (as we've done throughout this
chapter). However, both of these extensions are infinite-dimensional. More precisely, write
$d_i = t_i\, d/dt_i$ for the degree-derivation for variable $t_i$. Let $\mathcal{D}^*$ denote the algebra of
derivations $\bigoplus_{i=1}^{n} \mathbb{C}[t_0^{\pm 1}, \ldots, t_n^{\pm 1}]\, d_i \oplus \mathbb{C} d_0$. The resulting Lie algebra structure on the space
$\tilde{\mathfrak{g}} \oplus \mathcal{K} \oplus \mathcal{D}^*$ is uniquely determined up to a 2-cocycle $\tau : \mathcal{D}^* \times \mathcal{D}^* \to \mathcal{K}$, which defines
how the bracket of derivations contributes a central term. There is a two-dimensional
space of these $\tau$; choosing any of them defines a toroidal Lie algebra $\mathfrak{g}_\tau$. Adding $\mathcal{D}^*$
reduces the centre from the infinite-dimensional $\mathcal{K}$ to an $(n + 1)$-dimensional space. See
[53] for more details of the construction of $\mathfrak{g}_\tau$.

The role of the Virasoro algebra (which as we know is a central extension of
$\mathrm{Der}(\mathbb{C}[t^{\pm 1}]) = \mathrm{Vect}(S^1) \otimes \mathbb{C}$) is here replaced by an abelian extension [173] of the
complex vector fields on a torus or equivalently of $\mathrm{Der}(\mathbb{C}[t_0^{\pm 1}, \ldots, t_n^{\pm 1}])$. It is a Lie algebra
$\mathfrak{V}_\tau$, parametrised by the 2-cocycle $\tau$, defined on the space $\mathcal{K} \oplus \mathrm{Der}(\mathbb{C}[t_0^{\pm 1}, \ldots, t_n^{\pm 1}])$.
$\mathfrak{V}_\tau$ acts for instance on the Verma modules of $\mathfrak{g}_\tau$. We will be more interested in the Lie
subalgebra $\mathfrak{v}_\tau = \mathcal{K} \oplus \mathcal{D}^*$ of $\mathfrak{g}_\tau$. The modules constructed below carry a projective action
of the Witt algebra $\mathbb{C}[t_0^{\pm 1}]\, d_0$, as in the affine setting.

Affine algebras exist for their (integrable) modules and in particular their characters, so
we need to find an interesting class of modules for the toroidal algebras. This isn't easy to
do, but major progress was made in [53]. Let $L_\lambda$ be an irreducible highest-weight module
of level $k \ne 0$, for the affine algebra $\mathfrak{g}^{(1)}$, and let $W$ be any finite-dimensional module
for $\mathfrak{gl}_n$. Then [53] constructs an irreducible $\mathfrak{g}_\tau$-module $M_{\lambda,W}$ that has finite-dimensional
homogeneous spaces with respect to the natural $\mathbb{Z}^{n+1}$-grading, and thus has a character.
More precisely, they first obtain a $\mathfrak{v}_\tau$-module by applying a Verma-like construction to
$W \otimes \mathbb{C}[t_1^{\pm 1}, \ldots, t_n^{\pm 1}]$, and then they take the irreducible quotient $M_W$ as usual; finally,
they define a $\mathfrak{g}_\tau$-module structure on the tensor product $M_{\lambda,W} := L_\lambda \otimes M_W$. In [54] they
show that these are modules of a ‘near-vertex operator algebra’ (see Definition 5.1.3(c))
closely related to affine algebra vertex operator algebras at generic level. From this,
their characters can be computed, and familiar modular forms arise. This is promising
because interesting Lie algebra modules seem to be the ones that arise as modules of
related structures (e.g. Lie groups or vertex operator algebras). It is too easy to be a Lie
algebra module. On the other hand, these are surely not the best $\mathfrak{g}_\tau$-modules: they have
only found the analogue of generic $L(\lambda)$, but not yet the analogue of the ‘integrable’
modules. Their characters are like (3.1.7a), but we would like to identify modules with
characters analogous to the discrete series. By analogy with better-understood algebras,
we should look for modules with maximal numbers of ‘null vectors’ quotiented out.

It may seem artificial to choose a distinguished direction (namely the 0th), but to
some extent this is inevitable. It is an elementary consequence of Schur's Lemma (recall
Lemma 1.1.3) that in these irreducible $\mathfrak{g}_\tau$-modules, the centre $\mathrm{span}\{K_0, \ldots, K_n\}$ should
act as scalars, and thus an $n$-dimensional subspace must act trivially. These
representations are designed so that $K_0$ is nontrivial but the other $K_i$ act trivially.

What is natural to pursue from, for example, an algebraic point of view, and what is 
a successful theory from that point of view, is not necessarily of more general interest. 
It is from this broader, multidisciplinary standpoint that we (unfairly) judge the value 
of these generalisations. There is a large class of $\mathfrak{g}_\tau$-modules (namely those described
above) whose characters have (fairly weak) modularity properties, but this seems to 
arise solely from the well-milled Heisenberg algebra combinatorics and it isn’t clear yet 
that they have independent value. Possible physical relevance in Wess—Zumino—Witten 
models in more than two space-time dimensions is explored in, for example, [306]. The 
jury is still out on the greater relevance of toroidal algebras to, for example, Moonshine 
or physics, and certainly more work is needed. 


3.3.4 Lie algebras and Riemann surfaces 


The previous subsection emphasises the difficulties of higher-dimensional analogues 
of loop algebras. Perhaps the best generalisation of the affine algebras, particularly in 
the sense of retaining and enriching automorphic properties of the characters, asso- 
ciates infinite-dimensional Lie algebras to each Riemann surface with marked points. 
This theory has been developed in a series of papers by Krichever—Novikov, Bremner, 
Schlichenmaier, Sheinman and others; see [491] for a list of references. The starting
point is a reinterpretation of the Laurent polynomials $\sum_n a_n t^n \in L_{\mathrm{poly}}\mathfrak{g}$. Before, we
interpreted the formal variable $t$ as a point on the unit circle $S^1 \subset \mathbb{C}$, but now we regard $t$
as lying in the punctured plane $\mathbb{C} \setminus \{0\}$, or equivalently the twice-punctured Riemann
sphere $\mathbb{P}^1(\mathbb{C})$. Similarly, the Witt algebra $\mathrm{Vect}(S^1)$ can be interpreted as the Lie algebra
of meromorphic vector fields on $\mathbb{P}^1(\mathbb{C})$ with possible poles only at 0 and $\infty$.

Let $\Sigma$ be any Riemann surface of genus $g$, and choose $p \ge 1$ distinct ordered points
$P = (z_1, \ldots, z_p)$, $z_i \in \Sigma$. In the language of string theory described next chapter, we
can think of $\Sigma$ as being a world-sheet corresponding to $p$ asymptotic incoming or
outgoing strings (Section 4.3.1). Let $\mathcal{A}_{\Sigma,P}$ be the space of functions meromorphic on $\Sigma$,
with possible poles only at $P$, and let $\mathcal{L}_{\Sigma,P}$ be the space of meromorphic vector fields
on $\Sigma$, again with possible poles only at $P$. The bracket of $\mathcal{L}_{\Sigma,P}$ comes from the Lie
derivative, as usual with vector fields, while the bracket of $\mathcal{A}_{\Sigma,P}$ is taken to be trivial.
Let $\mathfrak{g}$ be any simple finite-dimensional Lie algebra. The loop algebra $L_{\mathrm{poly}}\mathfrak{g}$ is replaced
with $\mathfrak{g}_{\Sigma,P} := \mathcal{A}_{\Sigma,P} \otimes \mathfrak{g}$, with bracket $[\sum_i f_i \otimes x_i, \sum_j g_j \otimes y_j] = \sum_{i,j} f_i g_j \otimes [x_i y_j]$. The
Laurent polynomials $\mathbb{C}[t^{\pm 1}]$ are replaced with $\mathcal{A}_{\Sigma,P}$. The Witt algebra is replaced with
$\mathcal{L}_{\Sigma,P}$. Just as $\mathfrak{Witt}$ acts on $L_{\mathrm{poly}}\mathfrak{g}$ by derivations, so does $\mathcal{L}_{\Sigma,P}$ act on $\mathfrak{g}_{\Sigma,P}$.

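There are some subtle differences with the more familiar loop algebras. The loop
algebras have an important $\mathbb{Z}$-grading. These higher-genus algebras $\mathcal{L}_{\Sigma,P}$ and $\mathfrak{g}_{\Sigma,P}$
have instead an almost-grading by $\mathbb{Z}$, in the sense that $\mathcal{L}_{\Sigma,P}$ (say) can be decomposed
$\mathcal{L}_{\Sigma,P} = \bigoplus_n (\mathcal{L}_{\Sigma,P})_n$ as a vector space into finite-dimensional subspaces $(\mathcal{L}_{\Sigma,P})_n$, such that


$$[(\mathcal{L}_{\Sigma,P})_m, (\mathcal{L}_{\Sigma,P})_n] \subseteq \bigoplus_{k=m+n+L}^{m+n+M} (\mathcal{L}_{\Sigma,P})_k$$


for some fixed integers $L, M \in \mathbb{Z}$. This would be a true grading if $M = L = 0$. The
algebra $\mathfrak{g}_{\Sigma,P}$ behaves similarly. The subspaces $(\mathcal{L}_{\Sigma,P})_n$ and $(\mathfrak{g}_{\Sigma,P})_n$ are defined by
considering orders of poles (and splitting $P$ into incoming and outgoing points).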

In the loop algebra situation, for $\mathfrak{g}$ simple, there is a unique nontrivial central extension.
On the other hand, $\mathfrak{g}_{\Sigma,P}$ typically has several. However, only one will be compatible with
the almost-grading, and so that is the one we choose. Call it $\hat{\mathfrak{g}}_{\Sigma,P}$. Similarly, we get a
unique central extension $\hat{L}_{\Sigma,P}$ of $L_{\Sigma,P}$, which in the special case of a sphere with one
incoming and one outgoing puncture is $\mathfrak{Vir}$.

Verma modules, etc. for $\hat{\mathfrak{g}}_{\Sigma,P}$ can be defined as before using the universal enveloping
algebra, and are parametrised by $p = \|P\|$ highest weights $\lambda^{(1)},\ldots,\lambda^{(p)} \in \mathfrak{h}^*$ and a
complex number $k$ (the level). For these modules $W_{(\lambda,k)}$, $\lambda = (\lambda^{(1)},\ldots,\lambda^{(p)})$, there is an
analogue of the Sugawara construction (3.2.15), which shows that each of these $\hat{\mathfrak{g}}_{\Sigma,P}$-modules
$W_{(\lambda,k)}$ is simultaneously a $\hat{L}_{\Sigma,P}$-module, in perfect analogy with the affine
situation.

Physically, these algebras $\hat{\mathfrak{g}}_{\Sigma,P}$ and $\hat{L}_{\Sigma,P}$ should be regarded as higher-genus global
symmetries for, for example, the Wess–Zumino–Witten models discussed next chapter.
Locally, that is in terms of local coordinates at each marked point $z_i$, we get a copy of the
affine algebra $\mathfrak{g}^{(1)}$ and Virasoro algebra $\mathfrak{Vir}$. A module for, say, $\hat{\mathfrak{g}}_{\Sigma,P}$ similarly
specialises to the $\mathfrak{g}^{(1)}$-module $L(\lambda^{(i)})$ at each point $z_i \in P$.

The theory is still a work in progress; see, for example, [491], [492] and references
therein. But it can be expected that for each positive level $k$ and choice of $\Sigma$, and
$p$ highest weights $\lambda^{(i)} \in P_+^k(\mathfrak{g})$, a number of level-$k$ representations of $\hat{\mathfrak{g}}_{\Sigma,P}$ will be
singled out (the exact number being given by Verlinde’s formula (6.1.2)), and these will
‘transform covariantly’ with respect to the mapping class group of $\Sigma \setminus P$. Obviously this
is an exciting direction that should be pursued, with direct relevance to higher-genus
Moonshine (Section 6.3.1).


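Question 3.3.1. (a) Define $D = \prod_{\alpha \in \Delta_+} (1 - e^{-\alpha})^{\mathrm{mult}\,\alpha}$. Verify $r_i(D) = -e^{\alpha_i} D$.
(b) Find a vector $\rho \in \mathfrak{h}^*$ such that $w(e^{\rho} D) = \epsilon(w)\, e^{\rho} D$.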


Question 3.3.2. Let $A$ be a Cartan$_{\mathrm{BKM}}$ matrix, and $\mathfrak{g}$ the corresponding universal
Borcherds–Kac–Moody algebra.
(a) Prove $h_{ij}$ lies in the centre.
(b) Suppose the $i$th and $j$th rows of $A$ are identical. Then show that $h_{ii} - h_{jj}$ is in the
centre of $\mathfrak{g}$.


Question 3.3.3. In what ways (if any) do Theorems 3.3.3, 3.3.5, 3.3.6 change if the field 
is C and not R? 


Question 3.3.4. Prove that for any Lie algebra L obeying conditions (i), (ii), (iii) of 
Theorem 3.3.6, Lo will be an abelian subalgebra. 


3.4 Variations on a theme of character 
3.4.1 Twisted #3: twisted representations 


In this subsection we complete the introduction of the twisted character which we began 
in Section 1.5.4. These are to the usual character what the McKay—Thompson series 
are to the j-function. In Section 5.3.6 we generalise this construction, but as always 
the special case of affine algebras is particularly pretty and significant. The reader is 
encouraged to reread Section 1.5.4 for background. 

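Let’s start with a twisted affine algebra $\mathfrak{g}^{(N)}$, obtained as in (3.2.4) from the nontwisted
algebra $\mathfrak{g} = \bar{\mathfrak{g}}^{(1)}$ and an order-$N$ symmetry $\alpha$ of the Coxeter–Dynkin diagram of $\bar{\mathfrak{g}}$.
Consider any integrable highest-weight $\mathfrak{g}^{(N)}$-module $L(\lambda)$, $\lambda \in P_+^k(\mathfrak{g}^{(N)})$. Think of this
as a representation $\rho$. We can extend $\rho$ linearly to $\mathfrak{g}$, by defining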


p(xt") = EY "p(xt"), (3.4.1a) 


for $x$ in the $\alpha$-eigenspace $(\bar{\mathfrak{g}})_j$ (Section 1.5.4). This isn’t a true representation of $\mathfrak{g}$: it’s
called a twisted representation of $\mathfrak{g}$, as it obeys

[o(xt"), PON = Ey" ollt”, yt"), (3.4.1) 
when $x \in (\bar{\mathfrak{g}})_i$ and $y \in (\bar{\mathfrak{g}})_j$. Thus a true representation of the twisted affine algebra $\mathfrak{g}^{(N)}$
corresponds to a twisted representation of the nontwisted algebra $\mathfrak{g}$. In Section 5.4.6
we extend this notion of twisted representation to vertex operator algebras.

Twisted representations are vaguely reminiscent of projective representations. But a 
projective representation becomes a true representation when the algebra is extended, 
while a twisted representation becomes a true representation when the algebra is shrunk. 
Groups most naturally have projective representations, vertex operator algebras most 
naturally have twisted ones, and affine algebras have both. 

Consider more generally any symmetry $\alpha$ of the Coxeter–Dynkin diagram of $\mathfrak{g}$. As
in Section 3.2.2, $\alpha$ extends to an automorphism of $\mathfrak{g}$ (e.g. $\alpha(e_i) = e_{\alpha i}$, and $\alpha$ fixes the
centre and derivation). Because of this, $\alpha$ permutes the $\mathfrak{g}$-modules as in Section 1.5.4.
In particular, $\alpha$ takes the highest-weight module $L(\lambda)$ to $L(\alpha\lambda)$, where $(\alpha\lambda)_i = \lambda_{\alpha i}$, and
moreover takes weight-space $L(\lambda)_\mu$ to weight-space $L(\alpha\lambda)_{\alpha\mu}$. All of this generalises to
any Borcherds–Kac–Moody algebra.

Now suppose à“ = A, that is A is a fixed point of æ. Then L(A) and L(A)? are isomorphic 
as g-modules, so let Tą be a linear isomorphism of the space L(A) that intertwines their 
g-actions: that is, a(x).v = x.T,(v) in terms of the g-action of L(A). Because L(A) is 
irreducible, $T_\alpha$ is uniquely determined up to a scalar multiple; scaled appropriately, it will
permute all vectors of the form (3.3.2). By the $\alpha$-twisted character or twining character
$\chi_\lambda^\alpha$ we mean


c k (d — d”? 
Xx (h) = exp ( hy + 2 + ( ’) J tTLo)Tae" 


24 24hY 
Che. KAA Sd) 
= exp ( hy + A Te 8| JO tte, Vhebh (3.4.2a) 
u=ap 


where $d$ and $d^{\mathrm{orb}}$ are the dimensions of the semi-simple Lie algebras $\mathfrak{g}$ and $\mathfrak{g}^{\mathrm{orb}}$ (the
algebra $\mathfrak{g}^{\mathrm{orb}}$ is defined in Theorem 3.4.1) and $h_\lambda$, $c$ are in (3.2.9). As in (3.2.9a), the
normalisation here is chosen to make modularity simplest; see (3.4.2b) below. As we see
from (3.2.5d), the vector $\delta \in \mathfrak{h}^*$ in (3.4.2a) isolates the coefficient $2\pi i\tau$ of the derivation
$\ell_0$.


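Theorem 3.4.1 [213] Let $\mathfrak{g} = X_r^{(1)}$ be a nontwisted affine algebra, and let $\alpha$ be a
symmetry of the Coxeter–Dynkin diagram of $\mathfrak{g}$. Then for any integrable highest-weight
$\lambda$ of $\mathfrak{g}$, with $\alpha\lambda = \lambda$, the $\alpha$-twisted character $\chi_\lambda^\alpha(h)$, restricted to any $h \in \mathfrak{h}$ fixed by $\alpha$,
equals some true character $\chi_{\tilde\lambda}(h)$ of the ‘orbit Lie algebra’ $\mathfrak{g}^{\mathrm{orb}} = ((\mathfrak{g}^{\mathrm{op}})_0)^{\mathrm{op}}$.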


‘$\mathfrak{g}^{\mathrm{op}}$’ is the affine Kac–Moody algebra whose Coxeter–Dynkin diagram is that of $\mathfrak{g}$
except with all arrows reversed. Note that $\mathfrak{g}^{\mathrm{orb}}$ is not a subalgebra of $\mathfrak{g}$, although its
Cartan subalgebra $\mathfrak{h}^{\mathrm{orb}}$ can be identified with that $\mathfrak{h}_0$ of the fixed-point subalgebra $\mathfrak{g}_0$.
What is special about $\mathfrak{g}^{\mathrm{orb}}$ is that there is a natural map $P_\alpha$ (see Section 3.3 of [213] for its
precise construction) sending $\mathfrak{g}$-weights fixed by $\alpha$ to the weights of $\mathfrak{g}^{\mathrm{orb}}$, and preserving
all inner-products. The weight $\tilde\lambda$ in Theorem 3.4.1 is $P_\alpha(\lambda)$. The normalisation in (3.4.2a)
is exactly what one would expect for a character of $\mathfrak{g}^{\mathrm{orb}}$:
gre c k(d—d”*) 


ne =h : 3.4.2b 
à 24 a= ag E O An ( ) 


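For example, consider $\mathfrak{g} = A_{2n-1}^{(1)}$ and $\mathfrak{g} = A_{2n}^{(1)}$, respectively, with $\alpha$ being the left–right
reflection symmetry (‘charge-conjugation’) ‘$C$’ fixing the 0th node. Then the orbit
Lie algebra $\mathfrak{g}^{\mathrm{orb}}$ is the twisted affine algebras $D_{n+1}^{(2)}$ and $A_{2n}^{(2)}$, respectively. For $\mathfrak{g} =
\mathfrak{sl}_n^{(1)}$ with a cyclic symmetry (‘simple-current’) ‘$J^{n/d}$’ of order $d$ (so $d$ divides $n$),
$\mathfrak{g}^{\mathrm{orb}} = \mathfrak{sl}_{n/d}^{(1)}$. The map $P_\alpha$ in these examples is

$$P_C : \lambda_0\omega_0 + \sum_{i=1}^{n-1} \lambda_i(\omega_i + \omega_{2n-i}) + \lambda_n\omega_n \mapsto \sum_{j=0}^{n} \lambda_j\,\omega_j^{\mathrm{orb}},$$

$$P_C : \lambda_0\omega_0 + \sum_{i=1}^{n} \lambda_i(\omega_i + \omega_{2n+1-i}) \mapsto \sum_{j=0}^{n} \lambda_j\,\omega_j^{\mathrm{orb}},$$

$$P_{J^{n/d}} : \sum_{i=0}^{n/d-1} \lambda_i(\omega_i + \omega_{i+n/d} + \cdots + \omega_{i+n-n/d}) \mapsto \sum_{j=0}^{n/d-1} \lambda_j\,\omega_j^{\mathrm{orb}}.$$

The map $P_\alpha$ is not mysterious. For example, for $\mathfrak{g} = \mathfrak{sl}_{2n}^{(1)}$ and $\alpha = C$, the fundamental
weights $\omega_i^{\mathrm{orb}}$ of $\mathfrak{g}^{\mathrm{orb}}$ are the obvious basis for the $C$-invariant weights of $\mathfrak{g}$, namely
$\omega_i^{\mathrm{orb}} = \omega_i + \omega_{2n-i}$ (for $1 \le i < n$) together with $\omega_0^{\mathrm{orb}} = \omega_0$ and $\omega_n^{\mathrm{orb}} = \omega_n$.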


The most important case in Theorem 3.4.1 is the degenerate one. The Coxeter–Dynkin
diagram of $\mathfrak{sl}_n^{(1)}$ has an order-$n$ cyclic symmetry $J$. In this case, an $\alpha$-fixed point looks
like $\lambda = (\lambda_0, \lambda_0, \ldots, \lambda_0)$ for $\lambda_0 = k/n$, and the $\alpha$-twisted character $\chi_\lambda^\alpha(h)$, restricted to
$h$ fixed by $\alpha$, equals the $\tau$-independent function $\exp[2\pi i(\lambda(h) + ku)]$; that is, only the
top weight-space survives.

A good question in Lie theory is always rewarded with a beautiful answer. 
Theorem 3.4.1 holds more generally for any Borcherds—Kac—Moody algebra. The proof 
follows that of the Weyl—Kac—Borcherds character formula. 

We get from Theorems 3.4.1 and 3.2.4 that the twisted characters are modular func- 
tions, and obey an analogue of Theorem 3.2.3. As an isolated example, this is rather 
surprising, but it fits into a much larger context (Section 5.3.6). We also find there how 
modular transformations relate the twisted characters to twisted representations — it is 
quite analogous to (2.3.10b). From this greater context of vertex operator algebra mod- 
ules and characters twisted by automorphisms, the modularity of these twisted characters 
is not so surprising. What is more surprising is positivity, that is, the $q$-expansion has
positive integer coefficients. This is true, for instance, for only two-thirds of the McKay–Thompson
series $T_g$. See Section 7.3.5, especially Conjecture 7.3.3, for an analogous
result for the Moonshine module $V^\natural$.


3.4.2 Denominator identities 


A very useful formula for the characters of simple finite-dimensional Lie algebras $\mathfrak{g}$ is the
Weyl character formula (1.5.11). It is rare indeed when the trivial special case of a theorem
or formula is interesting. But that happens here. Consider the trivial representation: i.e.
$x \mapsto 0$ for all $x \in \mathfrak{g}$. Then the character (1.5.9a) is identically 1: $\mathrm{ch}_0 = 1$. Thus the
character formula tells us that a certain alternating sum over the Weyl group $W$ equals
a certain product over positive roots $\alpha \in \Delta_+$:

$$\prod_{\alpha \in \Delta_+} \left(1 - e^{-\alpha(z)}\right) = \sum_{w \in W} \epsilon(w)\, e^{(w\rho - \rho)(z)}. \qquad (3.4.3)$$

Here, $z$ lies in the Cartan subalgebra $\mathfrak{h}$, and the Weyl vector $\rho$ is $\omega_1 + \cdots + \omega_r$. Equation
(3.4.3) is called a denominator identity. For the smallest simple algebra $A_1$, (3.4.3) is
trivial: $1 - e^{-z} = e^{-z/2}(e^{z/2} - e^{-z/2})$. For $A_2$ we get a sum of six terms equalling a
product of three terms, and the complexity continues to rise from there.

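In particular, look at $\mathfrak{g} = \mathfrak{sl}_n(\mathbb{C})$. We can realise the roots, etc. of $\mathfrak{g}$ in terms of an
orthonormal basis $\{e_i\}$ of $\mathbb{C}^n$ as follows: the positive roots are $e_i - e_j$ for $1 \le i < j \le n$;
the Cartan subalgebra $\mathfrak{h}$ is the hyperplane orthogonal to $\sum_i e_i$; the Weyl group is the
symmetric group $S_n$, acting on $\mathbb{C}^n$ and hence $\mathfrak{h}$ by permuting the $e_i$; the Weyl vector is
$\rho = \frac{1}{2}\sum_i (n + 1 - 2i)e_i$. Write $z = \sum_i z_i e_i \in \mathfrak{h}$ and $x_i = e^{-z_i}$ (so $\prod_i x_i = 1$). Then the
left side of (3.4.3) becomes

$$\prod_{1 \le i < j \le n} \left(1 - e^{-z_i + z_j}\right) = x_1^{-1} x_2^{-2} \cdots x_n^{-n} \prod_{1 \le i < j \le n} (x_j - x_i).$$

The right side of (3.4.3) becomes

$$\prod_j x_j^{(n+1-2j)/2} \sum_{\pi \in S_n} \epsilon(\pi) \prod_i x_{\pi i}^{-(n+1-2i)/2} = x_1^{-1} x_2^{-2} \cdots x_n^{-n} \sum_{\pi \in S_n} \epsilon(\pi) \prod_i x_{\pi i}^{\,i}.$$

Thus the denominator identity for $\mathfrak{sl}_n(\mathbb{C})$ is simply the formula for the determinant of
the Vandermonde matrix

$$\det \begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ x_1^2 & x_2^2 & \cdots & x_n^2 \\ \vdots & \vdots & & \vdots \\ x_1^n & x_2^n & \cdots & x_n^n \end{pmatrix} = \prod_{1 \le i < j \le n} (x_j - x_i). \qquad (3.4.4)$$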
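The reduction of the $\mathfrak{sl}_n$ denominator identity to a Vandermonde determinant can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are illustrative, and it assumes the conventions $x_i = e^{-z_i}$ and $\prod_i x_i = 1$. It compares the alternating sum over $S_n$ of $\prod_i x_{\pi i}^{\,i}$, which is the determinant of the matrix with rows $x, x^2, \ldots, x^n$, with the product $\prod_{i<j}(x_j - x_i)$, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s


def weyl_sum(xs):
    """Alternating sum over S_n of prod_i x_{pi(i)}^i, i.e. det(x_j^i) for i, j = 1..n."""
    total = Fraction(0)
    for perm in permutations(range(len(xs))):
        term = Fraction(sign(perm))
        for i, image in enumerate(perm):
            term *= xs[image] ** (i + 1)
        total += term
    return total


def vandermonde_product(xs):
    """Product of (x_j - x_i) over all pairs i < j."""
    prod = Fraction(1)
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            prod *= xs[j] - xs[i]
    return prod


# Choose x_1, x_2, x_3 freely and fix x_4 so that the product of all x_i is 1.
xs = [Fraction(2), Fraction(3), Fraction(5)]
xs.append(1 / (xs[0] * xs[1] * xs[2]))
print(weyl_sum(xs) == vandermonde_product(xs))
```

Without the constraint $\prod_i x_i = 1$ the two sides differ by the factor $\prod_i x_i$, which is why the constraint coming from the Cartan subalgebra matters here.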


In the early 1970s Macdonald [396] generalised these finite denominator identities to
infinite identities, corresponding to the extended Coxeter–Dynkin diagrams. The simplest
of his was known classically as the Jacobi triple product identity:

$$\prod_{m=1}^{\infty} (1 - x^{2m})(1 + x^{2m-1}y)(1 + x^{2m-1}y^{-1}) = \sum_{n=-\infty}^{\infty} x^{n^2} y^n. \qquad (3.4.5a)$$

To Macdonald these were purely combinatorial, but soon Kac, Moody and others reinterpreted
his formulae as denominator identities for nontwisted affine algebras, that is
substituting $\lambda = 0$ into the Weyl–Kac character formula (3.3.3).
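The Jacobi triple product identity can be tested as an identity of formal power series. The sketch below is an illustration rather than anything from the original text: it assumes the identity in the form $\prod_{m \ge 1}(1 - x^{2m})(1 + x^{2m-1}y)(1 + x^{2m-1}y^{-1}) = \sum_n x^{n^2}y^n$, stores a series as a dictionary mapping exponent pairs $(a, b)$ of $x^a y^b$ to coefficients, and truncates at a chosen degree in $x$. All helper names are hypothetical.

```python
def multiply(p, q, max_deg):
    """Multiply two series in x and y, discarding terms of x-degree above max_deg."""
    out = {}
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            a = a1 + a2
            if a <= max_deg:
                key = (a, b1 + b2)
                out[key] = out.get(key, 0) + c1 * c2
    return {k: v for k, v in out.items() if v != 0}


def triple_product_lhs(max_deg):
    """Truncated product of (1 - x^(2m))(1 + x^(2m-1) y)(1 + x^(2m-1) / y)."""
    result = {(0, 0): 1}
    m = 1
    # Factors with 2m - 1 > max_deg equal 1 up to the truncation order.
    while 2 * m - 1 <= max_deg:
        factors = [
            {(0, 0): 1, (2 * m, 0): -1},
            {(0, 0): 1, (2 * m - 1, 1): 1},
            {(0, 0): 1, (2 * m - 1, -1): 1},
        ]
        for factor in factors:
            result = multiply(result, factor, max_deg)
        m += 1
    return result


def triple_product_rhs(max_deg):
    """Truncated sum of x^(n^2) y^n over all integers n."""
    out = {}
    n = 0
    while n * n <= max_deg:
        out[(n * n, n)] = 1
        out[(n * n, -n)] = 1
        n += 1
    return out


print(triple_product_lhs(30) == triple_product_rhs(30))
```

Truncating after every multiplication is legitimate because every factor has only nonnegative powers of $x$, so discarded terms can never contribute to lower degrees.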

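For example, parametrise the Cartan subalgebra of $A_1^{(1)}$ by $z\alpha_1 + \tau\ell_0 + uC$; then
(3.2.5d) says $(m\alpha_1 + n\delta)(z\alpha_1 + \tau\ell_0 + uC) = 2mz - n\tau$. The positive roots of $A_1^{(1)}$ are
$\alpha_1 + n\delta$ ($n \ge 0$), $-\alpha_1 + n\delta$ ($n \ge 1$) and $n\delta$ ($n \ge 1$). The Weyl group acts on the Weyl
vector $\rho$ by $t_{n\alpha_1}\rho = \rho + 2n\alpha_1 - (2n^2 + n)\delta$ and $t_{n\alpha_1} r_{\alpha_1}\rho = \rho + (2n - 1)\alpha_1 - (2n^2 - n)\delta$.
Thus the $A_1^{(1)}$ denominator identity is

$$\prod_{n=0}^{\infty} (1 - r q^n) \prod_{n=1}^{\infty} (1 - r^{-1} q^n)(1 - q^n) = \sum_{m=-\infty}^{\infty} (-1)^m r^{-m} q^{(m^2+m)/2}, \qquad (3.4.5b)$$

where, evaluated on the Cartan element above, $q = e^{-\delta}$ and $r = e^{-\alpha_1}$, that is $q = e^{\tau}$ and $r = e^{-2z}$.
Equation (3.4.5a) is recovered from (3.4.5b) by setting $x = \sqrt{q}$ and $y = -\sqrt{q}\, r^{-1}$.

Freeman Dyson is a famous quantum physicist, but started his academic life in number
theory and still enjoys it as a hobby. Dyson [166] found a curious formula for the
Ramanujan $\tau$-function, defined by $\sum_{n=1}^{\infty} \tau(n) q^n = \eta(q)^{24} := q \prod_{n=1}^{\infty} (1 - q^n)^{24}$:

$$\tau(n) = \sum \frac{\prod_{1 \le i < j \le 5} (a_i - a_j)}{1!\,2!\,3!\,4!}, \qquad (3.4.6)$$

where the sum is over all 5-tuples $a_i$ with $a_i \equiv i$ (mod 5) obeying $\sum_i a_i = 0$
and $\sum_i a_i^2 = 10n$. Using this, an analogous formula can be found for $\eta^{24}$. Dyson
knew that similar formulae were also known for $\eta^d$ for the values $d = 3, 8, 10, 14,
15, 21, 24, 26, 28, 35, 36, \ldots$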
What was ironic was that Dyson found (3.4.6) at the same time that Macdonald was
finding his own identities. Both were at Princeton then, and would often chat a little
when they bumped into each other after dropping off their daughters at school. But they
never discussed work. Dyson didn’t realise that his strange list of numbers has a simple
interpretation: they are precisely the dimensions of the simple Lie algebras! $3 = \dim(A_1)$,
$8 = \dim(A_2)$, $10 = \dim(C_2)$, $14 = \dim(G_2)$, etc. In fact these formulae for $\eta^d$ are none
other than (specialisations of) the Macdonald identities. For example, Dyson’s formula
is the denominator formula for $A_4^{(1)}$ ($24 = \dim(A_4)$). If they had spoken, they would
surely have anticipated the affine algebra denominator identity interpretation.
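Dyson's formula (3.4.6) is easy to test by brute force. The sketch below is an illustration, with function names of my own choosing: it enumerates the 5-tuples with $a_i \equiv i$ (mod 5), $\sum_i a_i = 0$ and $\sum_i a_i^2 = 10n$, sums $\prod_{i<j}(a_i - a_j)/(1!\,2!\,3!\,4!)$, and compares the result with the coefficients of $q\prod_{n \ge 1}(1 - q^n)^{24}$ computed directly.

```python
from itertools import product
from math import isqrt


def tau_from_product(n_max):
    """Coefficients tau(1), ..., tau(n_max) of q * prod_{n>=1} (1 - q^n)^24."""
    coeffs = [1] + [0] * (n_max - 1)  # series of prod (1 - q^n)^24 up to q^(n_max - 1)
    for n in range(1, n_max):
        for _ in range(24):
            # In-place multiplication by (1 - q^n), highest degree first.
            for k in range(n_max - 1, n - 1, -1):
                coeffs[k] -= coeffs[k - n]
    return {m + 1: coeffs[m] for m in range(n_max)}


def dyson_tau(n):
    """tau(n) from Dyson's sum over 5-tuples (a_1, ..., a_5)."""
    bound = isqrt(10 * n)  # each |a_i| is at most sqrt(10 n)
    choices = [
        [a for a in range(-bound, bound + 1) if (a - i) % 5 == 0]
        for i in range(1, 5)
    ]
    total = 0
    for head in product(*choices):
        a5 = -sum(head)  # forced by the condition sum a_i = 0
        if a5 % 5 != 0:
            continue
        tup = head + (a5,)
        if sum(a * a for a in tup) != 10 * n:
            continue
        term = 1
        for i in range(5):
            for j in range(i + 1, 5):
                term *= tup[i] - tup[j]
        total += term
    return total // 288  # 1! 2! 3! 4! = 288


reference = tau_from_product(6)
print([dyson_tau(n) for n in range(1, 7)])
print([reference[n] for n in range(1, 7)])
```

For $n = 1$ the only admissible tuple is $(1, 2, -2, -1, 0)$, whose product of differences is exactly 288, giving $\tau(1) = 1$.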

Incidentally, no simple Lie algebra has dimension 26, so the formula for $\eta^{26}$ can’t
correspond to any of Macdonald’s identities. Its algebraic meaning is still uncertain.

Macdonald certainly didn’t close the book on denominator identities. Any algebra with
a character formula analogous to (1.5.11) (e.g. Borcherds–Kac–Moody algebras (3.3.4))
will have one. Kac and Wakimoto [336] use denominator identities for Lie superalgebras
to obtain nice formulae for various generating functions involving sums of squares, sums
of triangular numbers (triangular numbers are numbers of the form $\frac{1}{2}k(k + 1)$), etc. For
instance, the number of ways $n$ can be written as a sum of 16 triangular numbers is

$$\frac{1}{3 \cdot 4^3} \sum ab\,(a^2 - b^2)^2,$$

where the sum is over all odd positive integers $a, b, r, s$ obeying $ar + bs = 2n + 4$ and
$a > b$.
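The triangular-number formula can also be checked directly for small $n$. In the sketch below (illustrative names, and the assumption that "ways" means ordered 16-tuples of triangular numbers with $0$ allowed), the count is read off from the 16th power of the generating function $\sum_{k \ge 0} q^{k(k+1)/2}$ and compared with the sum over odd positive $a, b, r, s$.

```python
def triangular_16_counts(n_max):
    """Coefficients of (sum_{k>=0} q^{k(k+1)/2})^16 up to q^n_max."""
    theta = [0] * (n_max + 1)
    k = 0
    while k * (k + 1) // 2 <= n_max:
        theta[k * (k + 1) // 2] = 1
        k += 1
    result = [1] + [0] * n_max
    for _ in range(16):
        new = [0] * (n_max + 1)
        for i, c in enumerate(result):
            if c:
                for j in range(n_max + 1 - i):
                    if theta[j]:
                        new[i + j] += c
        result = new
    return result


def kac_wakimoto_16(n):
    """(1 / (3 * 4^3)) * sum of a b (a^2 - b^2)^2 over odd a > b >= 1, odd r, s >= 1
    with a r + b s = 2 n + 4."""
    target = 2 * n + 4
    total = 0
    for a in range(3, target + 1, 2):
        for b in range(1, a, 2):
            for r in range(1, target // a + 1, 2):
                rest = target - a * r
                if rest > 0 and rest % b == 0 and (rest // b) % 2 == 1:
                    total += a * b * (a * a - b * b) ** 2
    return total // 192  # 3 * 4^3 = 192


counts = triangular_16_counts(8)
print(counts)
print([kac_wakimoto_16(n) for n in range(9)])
```

For $n = 0$ the only solution is $(a, b, r, s) = (3, 1, 1, 1)$, contributing $3 \cdot 64 = 192$, so the formula returns 1 as it should.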

The most important application of denominator identities from our perspective is 
Borcherds’ use of them (Section 7.2.2) in proving the Monstrous Moonshine conjectures. 
Indeed, this possibility was what motivated his introduction of the Borcherds—Kac— 
Moody algebras. Other applications are discussed next subsection. 

Explicitly writing down denominator identities for Borcherds–Kac–Moody algebras
tends to be quite difficult, because their root multiplicities are hard to find. The denominator
identity of the Monster Lie algebra $\mathfrak{m}$ is a remarkable identity originally due to
Zagier, but discovered independently by Borcherds and others:

$$p^{-1} \prod_{m > 0,\ n \in \mathbb{Z}} (1 - p^m q^n)^{a_{mn}} = J(z) - J(\tau), \qquad (3.4.7a)$$

with $p = e^{2\pi i z}$, where the powers ‘$a_i$’ are the coefficients of the $q$-expansion of the
modular function $J(\tau) = \sum_i a_i q^i$. This yields infinitely many nontrivial polynomial
identities in the coefficients $a_i$; for example, comparing third-degree terms on both
sides gives

$$a_4 = \binom{a_1}{2} + a_3. \qquad (3.4.7b)$$

In fact, (3.4.7a) is older than $\mathfrak{m}$ and is proved independently (Hecke operators permit a
quick proof); turning the logic around, it is used to tell us the root multiplicities of $\mathfrak{m}$.
This is its direct use in the proof of the Monstrous Moonshine conjectures.
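The identity $a_4 = \binom{a_1}{2} + a_3$ can be confirmed from the first few coefficients of $J$. The sketch below is only an illustration: it assumes the standard expression $j = E_4^3/\Delta$ with $E_4 = 1 + 240\sum_{n \ge 1}\sigma_3(n)q^n$ and $\Delta = q\prod_{n \ge 1}(1 - q^n)^{24}$, neither of which is spelled out in this section, and the function names are my own.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def multiply_series(a, b):
    """Product of two power series given as equal-length coefficient lists."""
    size = len(a)
    out = [0] * size
    for i, ai in enumerate(a):
        if ai:
            for j in range(size - i):
                out[i + j] += ai * b[j]
    return out


def j_coefficients(n_max):
    """Map k -> coefficient of q^k in j = E4^3 / Delta, for k = -1, ..., n_max."""
    size = n_max + 2
    e4 = [1] + [240 * sigma3(n) for n in range(1, size)]
    e4_cubed = multiply_series(multiply_series(e4, e4), e4)
    delta_over_q = [1] + [0] * (size - 1)
    for n in range(1, size):
        for _ in range(24):
            for k in range(size - 1, n - 1, -1):
                delta_over_q[k] -= delta_over_q[k - n]
    quotient = []  # coefficients of q * j; delta_over_q[0] == 1 so division is exact
    for k in range(size):
        acc = e4_cubed[k] - sum(quotient[i] * delta_over_q[k - i] for i in range(k))
        quotient.append(acc)
    return {k - 1: quotient[k] for k in range(size)}


a = j_coefficients(4)
print(a[1], a[2], a[3], a[4])
print(a[4] == a[1] * (a[1] - 1) // 2 + a[3])
```

Only $a_1$, $a_3$ and $a_4$ enter (3.4.7b), so the constant term (744 for $j$, 0 for $J$) plays no role in the check.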
Unfortunately, the numerator of the Weyl character formula for $L(\lambda)$ rarely has a
product formula. However, certain specialisations of the numerator can manifestly equal 
certain (A-dependent) specialisations of the denominator, and thus inherit the product 
expansion of the latter. Consider a simple example: any finite-dimensional A,,-module 
L(A) has a character satisfying 


Jy, — xiy: 
choto =D TT Tease (3.4.8a) 
Jens agt 
1l<i<j<n+l y y 
for any t € C, where x =e‘ and y; = exp[(i — D àj)t]. Similar formulae hold for 
all Kac—Moody algebras [374]. In particular, from these we obtain instantly Weyl’s 
dimension formula for finite-dimensional semi-simple Lie algebras: 


$$\dim L(\lambda) = \prod_{\alpha > 0} \frac{(\alpha \,|\, \lambda + \rho)}{(\alpha \,|\, \rho)}. \qquad (3.4.8b)$$
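For $A_n = \mathfrak{sl}_{n+1}$ the Weyl dimension formula becomes completely explicit. In the sketch below (an illustration; the function name and the choice of Dynkin labels as input are my own), the positive roots are $\alpha_i + \cdots + \alpha_j$ for $i \le j$, so that $(\alpha|\lambda + \rho) = \sum_{k=i}^{j}(\lambda_k + 1)$ and $(\alpha|\rho) = j - i + 1$.

```python
from fractions import Fraction


def weyl_dimension_sl(labels):
    """Dimension of the sl_{n+1}-module with Dynkin labels (lambda_1, ..., lambda_n)."""
    n = len(labels)
    dim = Fraction(1)
    for i in range(n):
        for j in range(i, n):
            # Positive root alpha_{i+1} + ... + alpha_{j+1} in 1-based numbering.
            numerator = sum(labels[k] + 1 for k in range(i, j + 1))
            dim *= Fraction(numerator, j - i + 1)
    return int(dim)


print(weyl_dimension_sl([1, 1]))     # adjoint representation of sl_3
print(weyl_dimension_sl([0, 1, 0]))  # second exterior power of C^4
```

Exact fractions are used so that intermediate quotients never introduce rounding; the final product is always an integer.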


3.4.3 Automorphic products 


In Section 2.4.1 we explain the important notion of lifting a modular form $f : \mathbb{H} \to \mathbb{C}$ for
a discrete subgroup $\Gamma$ of $G = \mathrm{SL}_2(\mathbb{R})$. The result is an automorphic function $\phi : G \to \mathbb{C}$
obeying the transformation (2.4.2b).

Borcherds discovered an unexpected way to lift (meromorphic) modular forms for
discrete $\Gamma$ in $\mathrm{SL}_2(\mathbb{R})$ to much larger Lie groups. His starting point was (3.4.7a), where
the coefficients of a modular function appear in the exponents of a product expansion.
In hindsight, another example of this phenomenon is the product formula (2.2.6b) for $\eta$:

$$\eta(\tau) = q^{1/24} \prod_{n=1}^{\infty} (1 - q^n)^1, \qquad (3.4.9)$$

where the powers ‘1’ are the coefficients of the $q$-expansion of the modular form $\theta_3(\tau)/2$.
Moreover, both (3.4.7a) and (3.4.9) are the denominators of the Monster algebra $\mathfrak{m}$
and the affine algebra $\mathfrak{u}^{(1)}$ (recall (3.2.12c)). Are these hints of a much more general
phenomenon?
Indeed. Borcherds found a far-reaching generalisation of (3.4.7a):


Theorem 3.4.2 [76] Suppose $f(\tau) = \sum_n a_n q^n$ is a meromorphic modular form for
$\mathrm{SL}_2(\mathbb{Z})$ of weight $-s/2$, holomorphic in $\mathbb{H}$ (so its only possible pole is at the cusp), and
with integer coefficients $a_n$. We require $s = 0, 8, 16, \ldots$; if $s = 0$ we also require that 24
divides $a_0$. Let $v_0 \in \mathbb{R}^{s+1,1}$ be a generic vector of negative norm. Then there is a unique
lattice vector $\rho \in II_{s+1,1} \subset \mathbb{R}^{s+1,1}$ such that

$$F(v) = e^{-2\pi i\, \rho \cdot v} \prod_{r \in II_{s+1,1},\ r \cdot v_0 > 0} \left(1 - e^{-2\pi i\, r \cdot v}\right)^{a_{-r \cdot r/2}} \qquad (3.4.10)$$

can be analytically extended to a meromorphic modular form on $H_{s+1,1}$ of weight $a_0/2$
for the group $O_{s+2,2}(\mathbb{Z})^+$.
Since $f$ in Theorem 3.4.2 has nonpositive weight and is holomorphic in $\mathbb{H}$, it will
necessarily have poles at the cusps $\mathbb{Q} \cup \{i\infty\}$ (unless it is constant). The set $II_{s+1,1}$
is the unique even self-dual lattice of signature $(s + 1, 1)$ (Section 1.2.1). $O_{s+2,2}(\mathbb{R})$ is
the group of $(s + 4) \times (s + 4)$ matrices $A$ with real entries, which obey $ADA^t = D$ for
$D = \mathrm{diag}(1, \ldots, 1, -1, -1)$. By a modular form for $O_{s+2,2}(\mathbb{Z})^+$, we mean the following.
First, the imaginary norm vectors in $\mathbb{R}^{s+1,1}$ lie in two disjoint cones; denote by $C$ the cone
containing $-v_0$. The analogue of the upper half-plane $\mathbb{H}$ is here the set $H_{s+1,1} \subset \mathbb{C}^{s+1,1}$
consisting of all vectors $v$ with imaginary part $\mathrm{Im}(v) \in C$. Then

$$F(v + \lambda) = F(v), \quad \forall \lambda \in II_{s+1,1}, \qquad (3.4.11a)$$

$$F(w(v)) = \pm F(v), \quad \forall w \in \mathrm{Aut}(II_{s+1,1})^+, \qquad (3.4.11b)$$

$$F\!\left(\frac{2v}{v \cdot v}\right) = \pm \left(\frac{v \cdot v}{2}\right)^{a_0/2} F(v), \qquad (3.4.11c)$$

for appropriate choice of signs, where $\mathrm{Aut}(II_{s+1,1})^+$ are the automorphisms of the lattice
$II_{s+1,1}$ that send the cone $C$ to itself. The transformations on $H_{s+1,1}$ given in (3.4.11)
generate a subgroup of $O_{s+2,2}(\mathbb{Z})$, denoted $O_{s+2,2}(\mathbb{Z})^+$. Now $F$ can be lifted to the Lie
group $O_{s+2,2}(\mathbb{R})^+$ in the usual way. This lifting of a modular form for a subgroup $\Gamma$ of
$\mathrm{SL}_2(\mathbb{R})$ to automorphic forms for $O_{s+2,2}(\mathbb{R})^+$ is called a Borcherds lift.

Of course (3.4.7a) is recovered from taking $f(\tau) = j(\tau) - 744$; then $s = 0$, and the
real Lie group $O_{2,2}(\mathbb{R})$ is essentially $\mathrm{SL}_2(\mathbb{R}) \times \mathrm{SL}_2(\mathbb{R})$ (that is, they share the same Lie
algebra; recall Theorem 1.4.3), with each $\mathrm{SL}_2(\mathbb{R})$ contributing a copy of $\mathbb{H}$ and $\mathrm{SL}_2(\mathbb{Z})$.

We can recover from $F$ more familiar modular forms by restricting the domain of $F$
to multiples $\tau v$ of imaginary norm vectors $v$ in $II_{s+1,1}$. For example, we get:


Theorem 3.4.3 [76] Let $f(\tau) = \sum_{n=-\infty}^{\infty} a_n q^n$ be any meromorphic modular form for
$\Gamma_0(4)$, holomorphic in $\mathbb{H}$ but possibly with poles at the cusps, and with integer coefficients
$a_n$. We require $a_n = 0$ unless $n \equiv 0, 1$ (mod 4). Then for some choice of $h \in \mathbb{Z}/12$,

$$F(\tau) = q^{h} \prod_{n=1}^{\infty} (1 - q^n)^{a_{n^2}}$$

is a meromorphic modular form of weight $a_0$, with all poles and zeros at cusps.


For example, (3.4.9) (or rather its square) is recovered by taking $f(\tau) = \theta_3(2\tau)$. Modular
forms for $\mathrm{SL}_2$ arise here because $O_{1,2}(\mathbb{R})$ is essentially $\mathrm{SL}_2(\mathbb{R})$.

In this section we find several examples of product expansions of modular forms, 
Jacobi forms, etc. coming from the denominators of characters. An exciting development 
is provided by Gritsenko and Nikulin [264], [265]. Given any hyperbolic Kac–Moody
algebra of rank $n \ge 3$ with certain properties (making them close in spirit to semi-simple
Lie algebras), there exists a Borcherds–Kac–Moody algebra of the same rank
with identical real roots (hence Weyl group, which will be a subgroup of $O_{n-1,1}(\mathbb{R})$),
but with precisely the imaginary simple roots needed so that its denominator is an
automorphic form for $O_{n,2}(\mathbb{R})$. It is reminiscent of Macdonald’s identities: he found he
needed to introduce extra factors to get modularity (namely the third product in (3.4.5b)),
and we now interpret those as due to the imaginary roots of the corresponding affine
algebra.

Most Borcherds—Kac—Moody algebras are of course not interesting; those that are (e.g. 
the Monster and fake Monster Lie algebras) have automorphic denominator identities. 
Thus this provides a systematic construction of what should be interesting Borcherds— 
Kac—Moody algebras. It is known that there are only finitely many such hyperbolic 
Kac—Moody algebras, and so this is a finite family of Borcherds-Kac—Moody algebras. 
Clearly, we should study their representation theory, and compute the characters of their 
‘interesting’ (presumably integrable) modules. In analogy with affine algebras, we may 
hope that the numerators of those characters will also be automorphic. 

Relations of these automorphic forms with mirror symmetry and string theory are 
beyond this book, but see, for example, [266], [342], [275], [276], [434]. The review 
article [358] is a good treatment of many of the topics of this subsection. 


Question 3.4.1. Let $f(\tau) = \sum_{n \ge 0} a_n q^n$, with $a_0 = 1$. Verify that, at least formally (i.e.
without any regard to convergence), this can be written as $f(\tau) = \prod_{n=1}^{\infty} (1 - q^n)^{b_n}$ for
some unique numbers $b_n$. If all $a_n$ are integers, then so are all $b_n$.


Question 3.4.2. Prove (3.4.8a) and the Weyl dimension formula (3.4.8b) for $\mathfrak{sl}_n$.


Question 3.4.3. Express the character $\chi_\lambda$ of any integrable representation $\lambda$ of $A_1^{(1)}$,
specialised appropriately, as an infinite product.


4 
Conformal field theory: the physics of Moonshine 


This chapter presents the physical context for Moonshine. Rather than diving into a 
conventional discourse of conformal field theory (CFT), it might be more helpful to 
take several steps back and begin with Galileo. Physics even more than mathematics is 
interwoven with history. Our treatment of CFT is sketchy but should supply the reader 
with all that is necessary to appreciate the absolutely profound role physics has played in 
Moonshine and other aspects of ‘pure’ mathematics in recent years. It is hoped that this 
chapter will make it easier for the interested reader to pursue more standard treatments 
of CFT and string theory. It is written primarily with the mathematician in mind. 

The third section explores the physics of CFT, and the fourth describes some mathe- 
matical formulations. CFT is to a generic quantum field theory what finite-dimensional 
semi-simple Lie algebras are to generic Lie algebras. Background for both sections 
is provided by the review of classical and quantum physics sketched in the first two 
sections. 

For a mathematician studying physics, it is important to keep in mind that physics has
been driven historically more by its predictive power than by conceptual concerns (with
a few remarkable exceptions, such as Einstein’s general relativity). Given enough time, 
however, the theory becomes polished to a state of pristine mathematical elegance, as 
classical mechanics amply demonstrates. In particular, one has the sense that quantum 
theory is ad hoc and rather unsound — and it is both — but these features are due to 
the historical accident that we were born too close to its inception. Much more impor- 
tant is what it can teach mathematics, which is considerable. The essence of quantum 
field theory is completely accessible to mathematicians and, as mathematics of the late 
twentieth century shows, should at least in its broad strokes be part of their standard 
repertoire. 

A special feature of classical physics is that the behaviour of a system — for example, 
its trajectory in phase space — becomes much simpler when looked at infinitesimally. 
The simple universal regularities are captured by differential equations; the complicated 
incidental features of a specific situation are relegated to the initial conditions. Among 
mathematicians, this central role of partial differential equations in classical physics was 
responsible for what had been a near-identification of their study with the subject they 
call mathematical physics. It was largely with the arrival of string theory that a much 
richer range of mathematics became relevant to physics, and it is this happy development 
that made this book possible. 

Almost every facet of Moonshine fits comfortably into CFT, where it often was discov- 
ered first. Some have questioned though the necessity of involving such a complicated 
beast, or the closely related ‘vertex operator algebras’ of the next chapter, in our mathematical
explanation of Moonshine. Although CFT has been an invaluable guide so far,
they would argue, perhaps we are a little too steeped in its lore. Undoubtedly there is 
truth in this, but CFT still has new insights to share. It is an integral part of Moonshine’s 
future as much as its past. Sections 4.3 and 4.4 are central to the whole book. 


4.1 Classical physics 
4.1.1 Nonrelativistic classical mechanics 


Temporarily forget what you know of physics. One of the most blatant empirical facts 
must be that anything in motion on Earth eventually slows to a stop. On the other hand, 
stars and planets clearly behave otherwise, therefore earthly laws can’t apply directly to 
the Heavens. Those observations are fundamental to Aristotelian physics. The starting 
point, however, for classical physics is Newton’s First Law: the remarkable thought (due 
to Galileo, 1632) that anything anywhere will continue to move in a straight line and at 
constant speed, unless something (by definition a force) acts on it. Although in isolation 
it has no real content, it presents a powerful strategy for analysing Nature. For example, 
to first approximation the Moon travels in a circle about the Earth; rather than trying to 
conceive of some strange mechanism responsible for pushing or dragging the Moon in 
its nonlinear orbit, the First Law instead leads us to imagine some ‘force’ that always 
pulls the Moon towards the Earth. This second possibility is much more promising of 
course, and led Newton to his theory of gravitation. 

Classical mechanics describes systems with finitely many degrees of freedom. The 
configuration (snapshot, instantaneous state) of a classical system at an instant $t$ of
time can be identified with the precise values of all degrees of freedom (e.g. position 
coordinates) at that time. The basic challenge is to predict the configuration at later 
times. This amounts to setting up and solving a system of differential equations, called 
the equations of motion of the system. 

Consider a system of $N$ particles, with positions $\mathbf{x}_i = (x_{i1}, x_{i2}, x_{i3})$. The $3N$ degrees
of freedom are the position coordinates $x_{ij}$. The equations of motion, which determine
the trajectories of the $N$ particles by giving their response to the stimulus, are

$$m_i \frac{d^2 \mathbf{x}_i}{dt^2} = \mathbf{F}_i, \qquad (4.1.1)$$

where $\mathbf{F}_i$ is the net force experienced by the $i$th particle and the proportionality constant
$m_i$ is called its mass. Dots are used to denote time derivatives: for example, velocity is
$\dot{\mathbf{x}}$ and acceleration is $\ddot{\mathbf{x}}$. Note that (4.1.1) is compatible with Newton’s First Law.

In general the force $\mathbf{F}_i$ can be a function of all positions $\mathbf{x}_j$, velocities $\mathbf{v}_j$ and time $t$;
for example, air resistance is approximately proportional to $\mathbf{v}$. We will restrict attention
to the typical ones (from which can be derived all others), which are of the form

$$(\mathbf{F}_i)_j = -\frac{\partial V(\mathbf{x}_1, \ldots, \mathbf{x}_N)}{\partial x_{ij}}$$
Fig. 4.1 The harmonic oscillator.


Fig. 4.2 Singular motion of five gravitationally interacting particles. 


for some real-valued function V called the potential. These are called conservative forces 
because they conserve (keep constant) energy. The potential has units of energy, and the 
sign is introduced so that V contributes positively to total energy. In quantum mechanics 
the potential V is more fundamental than the force F. 

For example, Newton’s gravitational potential is $V = -\sum_{i<j} \frac{G m_i m_j}{|\mathbf{x}_i - \mathbf{x}_j|}$, where $G$ is a
positive constant. Einstein found it profoundly significant that the gravitational ‘charge’
$m_i$ here is numerically (though certainly not conceptually) identical to the ‘inertial’ mass
$m_i$ in (4.1.1) (see Section 4.1.2).

For a one-dimensional example, consider a harmonic oscillator, for example the
spring in Figure 4.1. Hooke’s Law says that the force is $F = -k(x - x_0)$, where $k$ is a
positive constant and $x_0$ is the resting length of the spring. Hence $-k(x - x_0) = m\ddot{x}$, so

$$x = x_0 + a \cos\!\left(\sqrt{\tfrac{k}{m}}\, t\right) + b \sin\!\left(\sqrt{\tfrac{k}{m}}\, t\right) = x_0 + A \cos\!\left(\sqrt{\tfrac{k}{m}}\, t + \phi\right). \qquad (4.1.2)$$

This force is conservative, with potential $V = \frac{1}{2}k(x - x_0)^2$. This elementary system
is fundamental to theoretical physics, as it describes small oscillations about stable
equilibrium states (i.e. points at which all forces $\mathbf{F}_i$ vanish). Indeed, if $dV/dx$ vanishes
at $x = x_0$, for some potential $V$, then the Taylor expansion of $V(x)$ would begin like
$a_0 + a_2(x - x_0)^2$, and so it would behave like a harmonic oscillator. We encounter the
harmonic oscillator repeatedly in the following pages; in classical field theory these 
humble oscillations describe, for example, sound waves, and in quantum field theory 
they are the particles. 
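The closed-form solution (4.1.2) can be compared with a direct numerical integration of $m\ddot{x} = -k(x - x_0)$. The following is a minimal sketch under my own assumptions: the velocity-Verlet integrator, the parameter values and the function names are illustrative choices, not anything prescribed by the text. It also monitors the energy $\frac{1}{2}m\dot{x}^2 + \frac{1}{2}k(x - x_0)^2$, which should stay constant for this conservative force.

```python
import math


def simulate_spring(m, k, x_rest, x_init, v_init, t_end, steps):
    """Integrate m x'' = -k (x - x_rest) with the velocity-Verlet scheme."""
    dt = t_end / steps
    x, v = x_init, v_init
    a = -k * (x - x_rest) / m
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt
        a_new = -k * (x - x_rest) / m
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return x, v


def exact_spring(m, k, x_rest, x_init, v_init, t):
    """Closed-form solution x_rest + a cos(w t) + b sin(w t) with w = sqrt(k / m)."""
    w = math.sqrt(k / m)
    return x_rest + (x_init - x_rest) * math.cos(w * t) + (v_init / w) * math.sin(w * t)


def energy(m, k, x_rest, x, v):
    """Kinetic plus potential energy of the spring."""
    return 0.5 * m * v * v + 0.5 * k * (x - x_rest) ** 2


m, k, x_rest = 2.0, 8.0, 1.0
x_num, v_num = simulate_spring(m, k, x_rest, 1.5, 0.0, 3.0, 30000)
print(abs(x_num - exact_spring(m, k, x_rest, 1.5, 0.0, 3.0)))
print(abs(energy(m, k, x_rest, x_num, v_num) - energy(m, k, x_rest, 1.5, 0.0)))
```

Here $a = x(0) - x_0$ and $b = \dot{x}(0)/\sqrt{k/m}$, which is how the two constants in (4.1.2) are fixed by the initial conditions.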

The mathematical difficulties faced by quantum field theory are notorious, but remark- 
ably singular behaviour occurs in classical mechanics as well. For one example, con- 
sider five point particles interacting gravitationally, positioned as in Figure 4.2. Particle 5 
moves horizontally between the orbiting pairs 1 and 2, and 3 and 4. It is possible [485] to 
arrange for particle 5 to zip back-and-forth between those pairs, picking up speed, until 
in a finite time it reaches infinite speed without ever colliding with the other particles. 
Many other examples of singular behaviour in classical mechanics are possible [485]; it
is not known yet how typical they are among all possible motions. 

Later in this section and the next, we touch on other mathematical difficulties plagu- 
ing our physical theories. Generally speaking, these difficulties of classical and quan- 
tum physics have to do with probing space to arbitrarily high precision. Whenever we 
push scientific theories far beyond their established realm of reliability, our arrogance 
inevitably gets us punished.¹ The infinitesimal structure of space and time is surely
such an unjustified speculative extrapolation. Unfortunately, all our physics is built on 
it. It is tempting to guess that when we understand how the illusion of a macroscopic 
four-dimensional space-time continuum arises from more fundamental concepts, these 
mathematical difficulties should become more tractable. 

We know from our childhood that global properties can arise from second-order 
differential equations (‘The shortest distance between two points is a straight line’). 
Hamilton’s principle says that the solution to the equation of motion $m\ddot{x} = -\frac{dV}{dx}$,
subject to the boundary conditions $x(t_1) = x_1$, $x(t_2) = x_2$, is the path $t \mapsto x(t)$ obeying
the given boundary conditions, for which the action

$$S := \int_{t_1}^{t_2} \left( \frac{m}{2}\, \dot{x}(t)^2 - V(x(t)) \right) dt \qquad (4.1.3)$$

is stationary (minimal if $|x_1 - x_2|$ and $|t_1 - t_2|$ are both small). The integrand is called
the Lagrangian $L = T - V$, where $T = \frac{1}{2}m\dot{x}^2$ is the kinetic energy. The combination
$T + V$ for the stationary path $x(t)$ will be independent of the time $t$, and is called the
energy. Historically, a hard lesson to learn (even for men like Gauss and Hertz) was that 
energy is an abstract mathematical notion and not a measure of some physical quantity 
(see the excellent discussion in chapter 4, vol. I of [188]). 

This observation leads to a formulation of classical physics called Lagrangian mechan- 
ics, which will be central to our discussion of quantum field theory in Section 4.2 (in 
quantum theory concepts like force, velocity and acceleration cease to play fundamental 
roles). The possible configurations of our physical system can be regarded as forming
a manifold, called the configuration space $M$. For example, for a rigid body such as
a potato, the configuration space is $\mathbb{R}^3 \times SO_3(\mathbb{R}) \cong \mathbb{R}^3 \times \mathbb{P}^3(\mathbb{R})$: $\mathbb{R}^3$ gives its
centre-of-mass, and $\mathbb{P}^3(\mathbb{R})$ its orientation. The behaviour of a system is regarded geometrically
as a parametrised path $t \mapsto q(t)$ on $M$, called the trajectory. Let $q_i$ be a complete set
of local coordinates on $M$, obtained by restricting to some open set $U \subset M$ (recall
Definition 1.2.3). The $q_i$ represent the degrees of freedom of the system. The Lagrangian
$L = T - V$ is a function of $q_i$ and $\dot{q}_i$, that is, a function on the tangent bundle $TM$.
In particular, in order to capture the kinetic energy $T$, which usually will be quadratic in
the $\dot{q}_i$, we typically want $M$ to be Riemannian, with $T$ proportional to the norm-squared
$\dot{q} \cdot \dot{q}$. The potential $V$ will be a differentiable function on $M$. The equations of motion
in the coordinate patch $U$ are the Euler–Lagrange equations

\[ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_i} \right) = \frac{\partial L}{\partial q_i} \tag{4.1.4} \]

which say that the action (4.1.3) is stationary for the physical solutions $q_i(t)$. Equation
(4.1.4) is obtained from the calculus of variations by varying $q_i$.

¹ Examples abound. There is, for instance, the famous remark of Lord Kelvin in 1899 that all of physics has
been finished. Socrates’ theory near the end of Phaedo as to the nature of the Earth makes a merry read. In
mathematics recall the humbling experiences of Russell’s Paradox and Gödel’s Incompleteness Theorem.

To solve a physical system in Lagrangian mechanics, the first task would be to choose
good local coordinates $q_i$ on the configuration space $M$, then to express the kinetic and
potential energies in terms of $q_i$ and $\dot{q}_i$, and finally to write down and solve the corresponding
differential equations (4.1.4). Lagrangian mechanics (and Hamiltonian
mechanics, to be discussed shortly) are essentially equivalent to Newtonian mechanics
(4.1.1). Their appeal though should be clear to any mathematician: by freeing the
formulation from adherence to a specific choice of coordinates, the formal structure of
classical mechanics becomes more evident. This is especially valuable when extensions
of the theory are needed, for example when handling enormous numbers of particles
in statistical mechanics, or when we were struggling to obtain the laws of quantum
mechanics.

Returning to the harmonic oscillator, take $q = x - x_0$. Then $L = T - V = \frac{1}{2} m \dot{q}^2 - \frac{1}{2} k q^2$
and the Euler–Lagrange equation (4.1.4) yields the differential equation $m\ddot{q} = -kq$.
The configuration space is $\mathbb{R}$, and trajectories consist of segments $[-A, A]$
traversed periodically. Energy $T + V = \frac{1}{2} k A^2$ is constant on each trajectory.
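
The constancy of the energy along a trajectory can be checked numerically. The following is only a sketch: the integrator (velocity Verlet), the function name `simulate_oscillator` and all parameter values are our own illustrative assumptions, not part of the text.

```python
import math


def simulate_oscillator(m=2.0, k=8.0, amplitude=0.5, dt=1e-4, steps=50_000):
    """Integrate m*q'' = -k*q by velocity Verlet, starting at rest at q = amplitude.

    Returns the final position, the final velocity and the list of energies
    T + V recorded after every step. All default values are illustrative.
    """
    q, v = amplitude, 0.0
    energies = []
    for _ in range(steps):
        a = -k * q / m                 # acceleration from the equation of motion
        v_half = v + 0.5 * dt * a      # half-step velocity update
        q = q + dt * v_half            # full-step position update
        a_new = -k * q / m
        v = v_half + 0.5 * dt * a_new  # complete the velocity update
        energies.append(0.5 * m * v * v + 0.5 * k * q * q)
    return q, v, energies


if __name__ == "__main__":
    q_end, v_end, energies = simulate_oscillator()
    expected = 0.5 * 8.0 * 0.5 ** 2    # (1/2) k A^2 for the default parameters
    print("largest energy deviation:", max(abs(e - expected) for e in energies))
```

With these defaults the recorded energies should stay very close to $\frac{1}{2} k A^2$, and the final position should agree with $A\cos(\sqrt{k/m}\, t)$ up to the discretisation error.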

The pervasive habit of writing physical quantities with ‘units’ (metres, seconds, .. .) 
leads us into thinking of those mysterious entities as real and indispensable. In fact, 
many would regard as profound, or at least meaningful, the following question: What 
is the number of fundamental units in physics? However, Lagrangian mechanics should 
have led us to a somewhat more sophisticated understanding of units. Units themselves 
have no fundamental significance; choosing units is a special case of selecting a coor- 
dinate patch on the configuration space (together with a choice of time parameter). The 
common and useful practice of rejecting or anticipating formulae based on unit considerations
(‘dimensional analysis’) merely captures some homogeneity information stored in
the Lagrangian, and is the analogue here of the conservation laws of the following paragraphs.
In particular, suppose we’ve selected a coordinate patch $q : U \to \mathbb{R}^n$, $q \mapsto (q_i)$,
and we want to change the scales (i.e. units) on each coordinate axis (which, as an expression
of nationalistic pride, is fairly common). That is, we choose nonzero constants $\lambda_i$
and consider the rescaling $q_i \mapsto q_i' = \lambda_i q_i$ of local coordinates, as well as $t \mapsto t' = \lambda_0 t$.
This has two consequences. Firstly, we can write locally $L(q_i', \dot{q}_i', t') = L'(q_i, \dot{q}_i, t)$,
that is, we can continuously deform the Lagrangian. Inevitably, some choices of units
will simplify $L$ and hence ease the resulting arithmetic. Secondly and more importantly,
it typically will be possible to absorb the rescalings $\lambda_i$ into the various ‘physical constants’,
that is, the parameters in $L$, which will tell us invariance properties of $L$ and
hence of the equations of motion (4.1.4). This is how to obtain the convenient and well- 
known meta-theorem that says the units of each term of any physical expression should 
agree.
For example, note that the harmonic oscillator Lagrangian is invariant under the rescalings
$q \mapsto \lambda_1 q$, $t \mapsto \lambda_0 t$, $k \mapsto \lambda_1^{-2} k$ and $m \mapsto \lambda_0^{2} \lambda_1^{-2} m$; we see that each term of the
solution (4.1.2) has a well-defined and consistent scaling behaviour (as it must). Also,
for a preferred choice of $\lambda_i$, the Lagrangian reduces to $\dot{q}^2 - q^2$. For another example,
note that the gravitational Lagrangian $L = \frac{1}{2} m (\dot{x}_1^2 + \dot{x}_2^2 + \dot{x}_3^2) + G\frac{mM}{|x|}$ is invariant under
the rescaling $x_i \mapsto \lambda x_i$, $t \mapsto \lambda_0 t$, provided $m$ rescales like $\lambda^{-2} \lambda_0^{2}\, m$ and $GM$ rescales like
$\lambda^{3} \lambda_0^{-2}\, GM$. In all cases, this scaling behaviour can be taken as defining the ‘units’ of the
corresponding quantity; our definition here that the units of $L$ be trivial differs from the
usual one (where $L$ has units of ‘energy’), but this is merely a matter of convention.
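
The scaling invariance of the harmonic oscillator Lagrangian is easy to test numerically. In the sketch below the helper names `lagrangian`, `rescale` and `natural_scales` are hypothetical, and the choice $m' = k' = 2$ is one possible reading of the ‘preferred choice’ of scales, under which the Lagrangian becomes $\dot{q}^2 - q^2$.

```python
import math


def lagrangian(q, qdot, m, k):
    """Harmonic oscillator Lagrangian L = (1/2) m qdot^2 - (1/2) k q^2."""
    return 0.5 * m * qdot ** 2 - 0.5 * k * q ** 2


def rescale(q, qdot, m, k, lam0, lam1):
    """Apply q -> lam1*q and t -> lam0*t, so that qdot -> (lam1/lam0)*qdot,
    while m and k absorb the change of scale."""
    return (lam1 * q,
            (lam1 / lam0) * qdot,
            (lam0 ** 2 / lam1 ** 2) * m,
            k / lam1 ** 2)


def natural_scales(m, k):
    """Scales (lam0, lam1) for which the rescaled constants are m' = k' = 2."""
    lam1 = math.sqrt(k / 2.0)
    lam0 = math.sqrt(k / m)
    return lam0, lam1


if __name__ == "__main__":
    q, qdot, m, k = 0.3, -1.7, 2.5, 4.0
    print(lagrangian(q, qdot, m, k))
    print(lagrangian(*rescale(q, qdot, m, k, lam0=2.0, lam1=3.0)))
```

Both printed values should agree up to rounding, whatever nonzero scales are chosen.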

This discussion should lead us to suspect that other invariance properties of L may 
yield other ‘meta-theorems’, generalising in a way the dimensional analysis. Indeed that 
is beautifully the case. By a symmetry of our system, we mean a diffeomorphism $\alpha$ of
the configuration space $M$ respected by the physics:

\[ L(\alpha(q), \alpha_*(\dot{q})) = L(q, \dot{q}), \]

where $\alpha_*(\dot{q})$ is the induced map (derivative) on the tangent space with $i$th component
$\sum_j \frac{\partial \alpha_i}{\partial q_j} \dot{q}_j$. Note that, unlike the rescalings considered in the previous paragraph, here
we’re requiring that $L$ and hence all the physical constants be unchanged by $\alpha$. Then
$q(t)$ is a possible trajectory (i.e. a solution of (4.1.4)) iff $\alpha(q(t))$ is.

Now, suppose we have a continuous family $\alpha_s$ of symmetries, that is a one-parameter
subgroup $s \mapsto \alpha_s$ in the Lie group of symmetries. This symmetry can be used to vary
the coordinates $q_i$, $\dot{q}_i$ (and hence the action $S$ (4.1.3)) infinitesimally. What does
Hamilton’s principle ($\delta S = 0$) tell us here? The answer (Noether’s Theorem²) is remarkable:
continuous symmetries yield conservation laws! Define the quantity (‘charge’)

\[ Q := \frac{\partial L}{\partial \dot{q}} \left( \frac{\partial \alpha_s(q)}{\partial s} \right) \in \mathbb{R}. \]

This expression is meaningful because the ‘generalised momentum’ $p := \frac{\partial L}{\partial \dot{q}}$ is a section
of the cotangent bundle $T^*M$, while the derivative $\frac{\partial \alpha_s}{\partial s}$ of the path $\alpha_s(q)$ ($q$ fixed)
defines a section of the tangent bundle $TM$. Less formally, suppose $\alpha_s$ sends $q_i$ to
$q_i + s f_i(q, \dot{q}, t)$, keeping only first order in the parameter $s$; then $\alpha_s$ sends $\dot{q}_i$ to $\dot{q}_i + s \frac{df_i}{dt}$
to first order, and $Q = \sum_i p_i f_i$. In either case, an easy calculation from (4.1.4) shows
that $Q$ is constant along each trajectory, that is $Q$ is ‘conserved’. (A deeper reason for
this is that the Poisson bracket (4.1.6a) gives the space of solutions to (4.1.4) a symplectic
structure.)

For example, the gravitational potential $V = -\frac{G m_1 m_2}{|x_1 - x_2|}$ is invariant with respect to
translations $\alpha_s(x_i) = x_i + s a$ for any fixed vector $a \in \mathbb{R}^3$. The charge $Q$ here is $a \cdot p$,
where $p$ is the ‘total momentum’ $m_1 \dot{x}_1 + m_2 \dot{x}_2$. Varying $a$, we find that momentum
is conserved. We could say that the independence of the physics on absolute position
implies conservation of momentum. Likewise, independence of the physics on absolute
time implies conservation of energy. In classical mechanics, Poincaré showed that all
conservation laws are due to an underlying symmetry: if $Q$ is conserved, then the Poisson
bracket $\{Q, q\}_P$ of (4.1.6a) generates the corresponding symmetry. What is fundamental
here isn’t the Lie group action on $TM$, but rather the infinitesimal generators (Lie algebra
action), which need not be derived from a Lie group symmetry.

² As is typical, this designation is a little unfair: Noether published this in 1918, but Jacobi already knew in
1842 the connection between translation symmetry and momentum conservation, and rotational symmetry
and angular momentum.
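
To watch a translation symmetry produce a conserved momentum, here is a minimal numerical sketch. To avoid the singularity of the gravitational potential we assume instead a one-dimensional spring potential $V = \frac{1}{2}\kappa (x_1 - x_2)^2$, which is likewise invariant under $x_i \mapsto x_i + s a$; the function name, the integrator and all numerical values are illustrative assumptions.

```python
def total_momentum(m1=1.0, m2=3.0, kappa=5.0, dt=1e-3, steps=20_000):
    """Evolve two particles on a line coupled by V = (1/2)*kappa*(x1 - x2)**2
    with velocity Verlet; return the total momentum before and after."""
    x1, x2 = 0.0, 1.5
    v1, v2 = 0.4, -0.1

    def forces(a, b):
        f1 = -kappa * (a - b)      # -dV/dx1
        return f1, -f1             # -dV/dx2 is equal and opposite

    p_start = m1 * v1 + m2 * v2
    f1, f2 = forces(x1, x2)
    for _ in range(steps):
        v1 += 0.5 * dt * f1 / m1   # half kick
        v2 += 0.5 * dt * f2 / m2
        x1 += dt * v1              # drift
        x2 += dt * v2
        f1, f2 = forces(x1, x2)
        v1 += 0.5 * dt * f1 / m1   # second half kick
        v2 += 0.5 * dt * f2 / m2
    p_end = m1 * v1 + m2 * v2
    return p_start, p_end


if __name__ == "__main__":
    p_start, p_end = total_momentum()
    print("momentum before:", p_start, "after:", p_end)
```

Because the potential depends only on $x_1 - x_2$, the two forces cancel exactly and the total momentum should change only by rounding error.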

Another formulation of classical physics, useful for extensions to statistical and quantum
mechanics, is Hamiltonian mechanics. Recall the generalised momenta $p_i = \frac{\partial L}{\partial \dot{q}_i}$.
Together, the variables $q_i$, $p_j$ parametrise a $2n$-dimensional manifold, the cotangent bundle
$T^*M$, called phase space. The Hamiltonian $H(q_i, p_j)$ is the quantity $\sum_i p_i \dot{q}_i - L$,
expressed in variables $q_i$, $p_j$. Typically, it equals the total energy. The equations of motion
here, obtained by varying both $q_i$ and $p_i$, are Hamilton’s equations:

\[ \dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}, \tag{4.1.5} \]

that is $2n$ first-order differential equations, rather than the $n$ second-order differential
equations of Lagrangian mechanics (4.1.4). Although Hamiltonian mechanics is not
always equivalent to Lagrangian mechanics, it is for typical systems. Because Hamilton’s
equations (4.1.5) are first-order, the configuration of the physical system at any time $t$
is uniquely determined by the point in phase space it occupies at a given instant $t_0$.
Thus phase space serves as a moduli space for physics. A more careful treatment of 
Hamiltonian mechanics requires the language of symplectic geometry — see, for example, 
[15] for details. 

In classical mechanics the observables, that is the physically measurable quantities 
such as position, momentum or energy, are by definition real-valued smooth functions 
A(q, p) on phase space. It is through the observables that a physical theory is compared 
to experiment. The observables $C^\infty(T^*M)$ form an infinite-dimensional Lie algebra,
with bracket (in local coordinates) given by the Poisson bracket

\[ \{A, B\}_P := \sum_i \left( \frac{\partial A}{\partial q_i} \frac{\partial B}{\partial p_i} - \frac{\partial A}{\partial p_i} \frac{\partial B}{\partial q_i} \right) \tag{4.1.6a} \]

(see Question 4.1.2). Then Hamilton’s equations (4.1.5) imply

\[ \frac{dA}{dt} = \{A, H\}_P, \tag{4.1.6b} \]

where on the left $A$ is evaluated on a trajectory $(q(t), p(t))$. The term ‘first integral’
refers to any observable that is constant along each trajectory; the first integrals form a
Lie subalgebra of dimension $< 2 \dim M$ in the observables $C^\infty(T^*M)$. Equation (4.1.6a)
may seem obscure, but it is essentially equivalent to the natural bracket [X, Y ] of vector 
fields on a manifold — see corollary 5, page 217 of [15] for details. As we see in the next 
section, algebra arises in quantum field theory through the analogue there of Poisson 
bracket. 

For example, recall the harmonic oscillator. The generalised momentum $p = m\dot{q}$ is
the usual momentum. The Hamiltonian $H = \frac{1}{2m} p^2 + \frac{1}{2} k q^2$ is the energy. Hamilton’s
equations tell us $\dot{q} = p/m$ and $\dot{p} = -kq$. Phase space is the plane $\mathbb{R}^2$, with ellipses as
trajectories. The basic Poisson bracket $\{q, p\}_P = 1$ says the observables $q$, $p$, 1 span
Heis (recall (1.4.3)).
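
The bracket relations for the harmonic oscillator can be confirmed with a few lines of code. This is a sketch only: `poisson_bracket` approximates the partial derivatives in (4.1.6a) by central differences for one degree of freedom, and the constants `MASS` and `SPRING` are arbitrary illustrative values.

```python
def poisson_bracket(A, B, q, p, h=1e-4):
    """Central-difference approximation to {A, B}_P at the phase-space point (q, p)."""
    dA_dq = (A(q + h, p) - A(q - h, p)) / (2 * h)
    dA_dp = (A(q, p + h) - A(q, p - h)) / (2 * h)
    dB_dq = (B(q + h, p) - B(q - h, p)) / (2 * h)
    dB_dp = (B(q, p + h) - B(q, p - h)) / (2 * h)
    return dA_dq * dB_dp - dA_dp * dB_dq


MASS, SPRING = 2.0, 8.0   # illustrative values of m and k


def hamiltonian(q, p):
    """H = p^2/(2m) + (1/2) k q^2."""
    return p * p / (2 * MASS) + 0.5 * SPRING * q * q


def position(q, p):
    return q


def momentum(q, p):
    return p


if __name__ == "__main__":
    q0, p0 = 0.3, -1.2
    print("{q, p} =", poisson_bracket(position, momentum, q0, p0))
    print("{q, H} =", poisson_bracket(position, hamiltonian, q0, p0), " p/m =", p0 / MASS)
    print("{p, H} =", poisson_bracket(momentum, hamiltonian, q0, p0), " -kq =", -SPRING * q0)
```

By (4.1.6b) the last two brackets are the time derivatives of $q$ and $p$, so they should reproduce Hamilton’s equations $\dot{q} = p/m$ and $\dot{p} = -kq$.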


4.1.2 Special relativity 


The fundamental theoretical advance of the nineteenth century was Maxwell’s electro- 
magnetism (Section 4.1.3), which unified light, electricity and magnetism. Although both 
Newtonian mechanics and Maxwell’s theory were enormously successful, they were in 
some conflict. For instance, Maxwell’s theory yields the formula

\[ c := \text{speed of light} = \frac{1}{\sqrt{\epsilon_0 \mu_0}}, \]

where $\epsilon_0$, $\mu_0$ are numerical constants associated with the vacuum. This seems to suggest
that the speed of light is itself a constant, independent of the observer. However Newton — 
and common sense — would have us believe that the speed at which light, or anything 
else, travels is variable. If light is emitted from a headlight with speed c, and a bug 
approaches the oncoming car with speed v, then surely to it that light travels with speed 
v+c. 

The standard resolution in the nineteenth century was to regard Maxwell’s equations
as valid only with respect to a substance called the aether. The aether would be the stuff 
in which light-waves wave (propagate) — it would be to light what air is to sound. This 
aether concept was getting increasingly awkward as the century turned. Einstein’s act of 
genius here was to flip the logic and trust Maxwell’s message. Thus, the speed of light 
is the same for all observers: the light from that approaching car strikes the bug with the 
same speed c it left the headlights. Special relativity consists of the modifications this 
message implies for Newtonian physics. Indeed what we call magnetism can be thought 
of as a relativistic correction to the electrostatic force; Maxwell’s electromagnetism was 
the first relativistic theory, created years before Einstein’s birth. 

The word ‘special’ in ‘special relativity’ arises because the equations are simplest 
and fundamental only for a certain class of privileged observers called ‘inertial’ — 
uniformly moving observers for which Newton’s First Law holds. A car rounding a 
corner is certainly not inertial, but a coasting isolated spaceship could be treated as one 
to good approximation. Special relativity also applies to accelerating observers, provided 
one works infinitesimally. Physically speaking, general relativity (Section 4.1.3), which 
removes this preferential treatment of inertial observers, is a mathematically elegant 
global integration of the equivalence principle and locally applied special relativity. 

An inertial observer is simply a choice of fixed basis in $\mathbb{R}^4$; the coordinates $(x, t)$
with respect to this basis, of a point (‘event’) $x$ in $\mathbb{R}^4$ (‘space-time’), have the physical
interpretation to that observer as space and time coordinates. Not every choice of basis
is permitted: we require them to be orthonormal in the sense that the straight-line trajectory
(‘world-line’) $(x(t), t)$ traced in space-time $\mathbb{R}^4$ by a beam of light is required to
satisfy $(x(t) - x(0)) \cdot (x(t) - x(0)) = c^2 t^2$; this is what we mean by the speed of light
being constant. Thus we are led to endow space-time $\mathbb{R}^4$ with the indefinite Minkowski
metric $\eta = (\eta_{\mu\nu}) = \mathrm{diag}(1, 1, 1, -c^2)$. We write $x^2$ for $x \cdot x = \sum_{\mu,\nu} x_\mu x_\nu \eta_{\mu\nu}$ and
$x \cdot y = \sum_{\mu,\nu} x_\mu y_\nu \eta_{\mu\nu}$. Basis transformations between inertial observers belong to the Lie
group $O_{3,1}(\mathbb{R})$. As mentioned in Section 1.4.2, it has four connected components; the
component containing the identity is the Lorentz group $SO^+_{3,1}(\mathbb{R})$. Its universal cover
$SL_2(\mathbb{C})$ and their semi-direct products with translations $\mathbb{R}^4$ (the Poincaré group and its
double-cover) also arise in physics. Thus in special relativity space and time are coupled,
just as in Euclidean geometry the $x$, $y$, $z$ coordinates are coupled (i.e. their independent
objective significance is denied). The disturbing dissimilarity between our qualitative
experiences of time and space is ignored by Einstein’s theory. Discovering what relation
this dissimilarity has to the different signs in the metric, or to the apparent magnitude
of $c$, clearly should be a fundamental task. By contrast, in Newtonian mechanics space-time
$\mathbb{R}^4$ factorises globally as $\mathbb{R}^3 \times \mathbb{R}$, and the basis transformations are taken from
$O_3(\mathbb{R}) \times \{\pm 1\}$.

That Maxwell’s equations are invariant under the Lorentz group was known before 
Einstein. Einstein’s contribution was to interpret the Lorentz group as giving the transformation
of physical space and time. For example, the space-time transformation $\Lambda$
between two observers with parallel spatial coordinate axes but travelling with uniform
relative velocity $\mathbf{v} = (v, 0, 0)$, according to Einstein and Newton, is


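\[ \Lambda = \begin{pmatrix} \dfrac{1}{\sqrt{1 - v^2/c^2}} & 0 & 0 & \dfrac{v}{\sqrt{1 - v^2/c^2}} \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \dfrac{v/c^2}{\sqrt{1 - v^2/c^2}} & 0 & 0 & \dfrac{1}{\sqrt{1 - v^2/c^2}} \end{pmatrix}, \tag{4.1.7a} \]

\[ \Lambda = \begin{pmatrix} 1 & 0 & 0 & v \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{4.1.7b} \]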


respectively. Note that in the limit $c \to \infty$, (4.1.7a) tends to (4.1.7b). Physically, matrix
(4.1.7a) says that the lengths of moving objects shrink, and their clocks run more slowly.
This is not some illusion, optical or otherwise. For example, the muon is an unstable
elementary particle with an average lifespan of $2 \times 10^{-6}$ seconds when at rest. When
travelling at speed $v$, it will last on average $2 \times 10^{-6}/\sqrt{1 - v^2/c^2}$ seconds. It will travel
further than it would have if (4.1.7b) had been the correct transformation, and because
of that will be able to participate in interactions that would have been too distant for a
muon behaving nonrelativistically. Other physical quantities transform similarly: for
example, the parameter $m$ playing the role of relativistic mass equals $m_0/\sqrt{1 - v^2/c^2}$,
for some constant $m_0$ called rest-mass. Now, expand this out using the binomial series:

\[ m = m_0 + \frac{1}{2} m_0 \frac{v^2}{c^2} + \frac{3}{8} m_0 \frac{v^4}{c^4} + \cdots \]

Multiplying by $c^2$, we recognise the second term as kinetic energy and we are led to
suspect that $mc^2$ is the relativistic analogue of kinetic energy, that is, $E = mc^2$ for a
free particle.³
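
The claims about the boost matrix can be tested directly. The sketch below builds (4.1.7a) and (4.1.7b) for events ordered as $(x^1, x^2, x^3, t)$, checks that the interval $x \cdot x$ with $\eta = \mathrm{diag}(1, 1, 1, -c^2)$ is preserved, and evaluates the dilated muon lifetime. The function names and the sample values of $v$ and $c$ are assumptions made for illustration.

```python
import math


def boost(v, c):
    """Lorentz boost (4.1.7a) along the first axis, acting on (x1, x2, x3, t)."""
    g = 1.0 / math.sqrt(1.0 - v * v / (c * c))
    return [[g, 0.0, 0.0, g * v],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [g * v / (c * c), 0.0, 0.0, g]]


def galilean(v):
    """Newtonian transformation (4.1.7b)."""
    return [[1.0, 0.0, 0.0, v],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def apply(matrix, event):
    """Matrix times column vector."""
    return [sum(row[j] * event[j] for j in range(4)) for row in matrix]


def interval(event, c):
    """Minkowski norm-squared with metric diag(1, 1, 1, -c^2)."""
    x1, x2, x3, t = event
    return x1 * x1 + x2 * x2 + x3 * x3 - c * c * t * t


def dilated_lifetime(rest_lifetime, v, c):
    """Average lifetime of a particle moving at speed v."""
    return rest_lifetime / math.sqrt(1.0 - v * v / (c * c))


if __name__ == "__main__":
    c, v = 2.0, 1.2
    event = [1.0, 2.0, -0.5, 0.7]
    print(interval(event, c), interval(apply(boost(v, c), event), c))
    print(dilated_lifetime(2e-6, 0.6, 1.0))
```

The two printed intervals should agree, while replacing `boost(v, c)` by `galilean(v)` would generally change the interval.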

In order to compare observations, we need to understand how the physical quantities 
change when we switch inertial observers, that is, how they transform with respect 
to the Lorentz group $SO^+_{3,1}(\mathbb{R})$. Typically, they transform like matrix entries of
$SO^+_{3,1}$-representations. For example, the 4-vector $(x, t)$ transforms with respect to the defining
representation of the Lorentz group, as does the energy-momentum 4-vector $(p, E/c^2)$,
and thus its Minkowski norm-squared $p^2 - E^2 c^{-2}$ is an observer-independent quantity
(a Lorentz scalar) and equals $-m_0^2 c^2$. It is conventional to denote with superscripts the
components of any such 4-vector: for example, $(x, t) = (x^1, x^2, x^3, x^4)$.
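
As a quick sanity check of this invariant, the sketch below assumes the standard relativistic expressions $p = m v$ and $E = m c^2$ with $m = m_0/\sqrt{1 - v^2/c^2}$ for motion along one axis; the function names and numerical values are illustrative only.

```python
import math


def energy_momentum(m0, v, c):
    """Return (p, E) for a particle of rest-mass m0 moving at speed v < c."""
    g = 1.0 / math.sqrt(1.0 - v * v / (c * c))
    return g * m0 * v, g * m0 * c * c


def minkowski_norm_squared(p, energy, c):
    """p^2 - E^2 / c^2 for the 4-vector (p, E/c^2) with metric diag(1, 1, 1, -c^2)."""
    return p * p - energy * energy / (c * c)


if __name__ == "__main__":
    m0, c = 1.5, 2.0
    for v in (0.0, 0.5, 1.0, 1.9):
        p, energy = energy_momentum(m0, v, c)
        print(v, minkowski_norm_squared(p, energy, c))
```

Every line of output should show the same value $-m_0^2 c^2$, whatever the speed.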

Writing equations of motion presents us with a challenge: in Newtonian physics we 
always want to differentiate or integrate with respect to time; however, relativity teaches 
that we shouldn’t treat time distinctly from the spatial coordinates. Moreover, ‘$dt = dx^4$’
transforms like a component of a 4-vector, which isn’t necessarily what we want. The
solution is that the infinitesimal norm-squared $dx^2 - c^2 dt^2 =: -c^2 d\tau^2$ is $O_{3,1}$-invariant,
defining the ‘proper time’ $\tau$, and so we should differentiate/integrate with respect to $\tau$.
Physically, $\tau$ is the time coordinate in the (usually only infinitesimally inertial) reference
frame in which the particle is at rest. The Lagrangian $L$ is a Lorentz scalar, and the action
(4.1.3) becomes $\int L \, d\tau$. For example, the Lagrangian for a free particle (‘free’ means
no forces act on it, so the potential $V$ is 0) can be taken to be

\[ L = \frac{1}{2} m_0 \left( \left( \frac{dx}{d\tau} \right)^2 - c^2 \left( \frac{dt}{d\tau} \right)^2 \right). \]


The Hamiltonian, being energy, transforms like time. 


But what if there are several particles: which proper times $\tau_i$ do we use? The $\tau$ for the
centre-of-mass, perhaps? In fact, this is a serious problem. The ‘No-Interaction Theorem’
(beginning with [124]) says that there can be no direct Lorentz-invariant interaction 
between particles, except through forces localised at a point causing an instantaneous 
change of velocity that don’t change the number of particles. As there do seem to be 
unstable elementary particles (e.g. the muon) and gravity for instance isn’t localised to a 
point, we have a problem. The obvious solution is to copy the first relativistic interaction 
theory, namely Maxwell’s, and use fields (Section 4.1.3). 

Special relativity says that the speed of light is fundamental to space-time. Modern 
physics helps us to accept this seeming glorification of light, by saying that there is a 
special speed $c$, and any particle with zero rest-mass $m_0$ (such as the photon, which
mediates light) will always travel at that speed. But perhaps more can be said. Surely
space-time is not a fundamental physical quantity; eventually it will be recognised as a
fairly macroscopic epiphenomenon, and it will be understood how it arises operationally.
For instance, we can measure distance using rigid bodies called metersticks and time
using quartz watches, but both this rigidity and periodicity are electromagnetic phenomena.
Perhaps the constancy of the speed of, for example, light will be understood
ultimately as a reflection of this circularity.

³ The equivalence of matter and energy was proposed 50 years before Einstein, by Mendeleev, the father of
the periodic table. Although his reasons were correct, his proposal was ignored and forgotten.

Einstein found the special treatment of inertial observers quite artificial. But it seems 
that accelerating observers can experience interesting phenomena. For instance, consider 
an observer S standing at the North Pole and an inertial observer T hovering above her, so 
T watches S uniformly spinning at the rate of one cycle every 24 hours. Let’s assume for 
simplicity that the Earth’s equator is a perfect circle; to $T$, the ratio of its circumference
to the diameter of the Earth at the equator should be $\pi$. However, if $S$ was to measure
precisely the circumference and the diameter, she would find their ratio for this ‘circle’
to be (very slightly) greater than $\pi$. The reason is that $S$’s observations must
be consistent with $T$’s: (4.1.7a) tells us that lengths parallel to the motion (such as $S$’s
metersticks along the equator as seen by $T$) will contract by the factor $\sqrt{1 - v^2/c^2}$, while
lengths perpendicular to the motion (e.g. the diameter) will remain unchanged. Likewise,
S will find that her wristwatch will tick more quickly than a clock placed on the equator, 
even though both are at rest relative to her. Thus both geometry and physics change for 
non-inertial observers! (For a fairly convincing argument that gravity requires curved 
space-time, see section 7.3 of [422].) 

In fact relaxing the inertial observer restriction provided Einstein with the key to 
his remarkable explanation of gravity. As mentioned earlier, the gravitational ‘charge’ 
numerically equals the mass $m$ seen in formulae such as $F = ma$ or $T = \frac{1}{2} m v^2$; this is
precisely what Galileo’s Pisa experiment was designed to verify. There are other ‘forces’ 
with this same property, for example the pull we feel when riding a merry-go-round. This 
got Einstein thinking: perhaps gravity is as fictitious as a centrifugal force? When we 
are in free-fall — whether in an orbiting spaceship or in an elevator suddenly decoupled 
from its cable — it is as if we are free of gravity, much as we are suddenly free of the 
centrifugal force when we step off the merry-go-round. This is the equivalence principle, 
which constitutes the only new physical content of general relativity. We are led to the 
thought that the gravitational ‘force’ experienced while sitting in a chair isn’t due to 
the matter in the Earth pulling us towards it, but rather merely a consequence of the 
chair interfering with our natural inertial motion, just as does a car rounding a corner. 
All observers are physically valid, but awkward choices (such as me in a chair or in a 
turning car) introduce fictitious forces such as gravity. Everything tries to move in as 
straight a line, and with as constant a speed, as possible (at least if it’s not under the 
influence of a true force like magnetism); that astronomical effect we call ‘gravity’ is 
merely a consequence of the fact that ‘straight’ has only a local significance. Space-time 
is not the vector space $\mathbb{R}^4$, but rather a nontrivial (curved) four-dimensional pseudo-Riemannian
manifold. Gravity is the convergence or twisting of nearby geodesics; what
we perceive as the elliptical revolution of the Earth about the Sun is merely the gentle 
entwining of the Earth’s geodesic with the Sun’s (Figure 4.3). General relativity, which 
we discuss briefly at the end of the next subsection, makes these thoughts mathematically 
precise.

Fig. 4.3 The revolution of the Earth about the Sun.


4.1.3 Classical field theory 


In physics, a ‘field’ is as in ‘vector field’ rather than ‘number field’. It means a function 
of (usually) space-time, or more precisely a section of some vector bundle whose base is 
space-time. The most familiar example is Newton’s gravitational field, namely the grav- 
itational potential V (x, t). Another example is Maxwell’s electromagnetic field F(x, t), 
which is matrix-valued. 

Until now, we’ve been interested in particle dynamics, and the fields were auxiliary. 
To analyse how object A gravitationally influences object B, we first calculate how A 
influences the gravitational field, and then how the gravitational field influences B. In 
classical field theory, the field is a mechanical system in its own right — for example, 
it carries energy much like a fluid. It allows us to avoid the No-Interaction Theorem of 
relativistic dynamics. In quantum field theory discussed in the next section, the field is 
primary and the particle becomes an auxiliary phenomenon called a quantum, apparent 
only asymptotically. 

A cherished physical principle, going back at least to Faraday, is called locality. The 
idea is that the only way we can directly affect something, is by nudging it. In order to 
influence something not touching us, we must propagate a disturbance from us to it, such 
as a sound-wave in air or a ripple in water. Special relativity sharpened locality into the
requirement that no disturbance or influence can travel faster than light,⁴ so that space-time
points $(x, t)$, $(x', t')$ that are space-like separated (i.e. obey $(x - x')^2 > c^2 (t - t')^2$)
are causally independent. As Faraday himself noted, locality leads to the concept of
field. This is the main purpose for both classical and quantum fields — they provide a 
natural vehicle for realising locality. 

⁴ Strictly speaking this isn’t a consequence of relativity, and in fact some physicists have entertained the
possible existence of particles (‘tachyons’) that travel faster than light. These would behave curiously (e.g.
they slow down the more energised they become), but like us they would require infinite energy to
reach the speed of light: sadly, once a tachyon, always a tachyon. The difficulties facing the existence of
tachyons are causality paradoxes. If P and Q are two space-like separated events, then there are reference
frames in which P occurs before Q, and others in which Q occurs before P (why?). Hence if we had a gun
that shot tachyonic bullets, then to some observers our victim would die before we pulled the trigger.
Though not a logical contradiction, it is distinctly odd. Almost all physicists dismiss tachyons and
faster-than-light influences as science fiction.

Before, configuration space was finite-dimensional, with coordinates $(q_1, \ldots, q_n)$.
Now our coordinates have a continuous index, $q_x = \varphi(x)$, and configuration space
is a space of functions. The Lagrangian in particle dynamics looks like $\sum_i T_i - \sum_{i<j} V_{ij}$.
Now the sums are replaced by integrals and the Lagrangian becomes
$L = \iiint \mathcal{L} \, dx \, dy \, dz$ for some function $\mathcal{L}$ called the Lagrangian density. $\mathcal{L}$ is a function
of the fields $\varphi^a(x, y, z, t)$ and their partial derivatives $\partial_x \varphi^a$, etc. (together with contributions
from particles). In field theory, $\mathcal{L}$ is more elementary and fundamental than $L$.
Locality takes the form here of requiring that $\mathcal{L}$ only involves one space-time point. For
each field $\varphi^a$ there is a field equation

\[ \frac{\partial}{\partial t} \frac{\partial \mathcal{L}}{\partial(\partial_t \varphi^a)} + \sum_{i=1}^{3} \frac{\partial}{\partial x_i} \frac{\partial \mathcal{L}}{\partial(\partial_{x_i} \varphi^a)} = \frac{\partial \mathcal{L}}{\partial \varphi^a}, \tag{4.1.8} \]


which describes the behaviour of the field. Additional equations (4.1.4) exist for each
particle degree-of-freedom $q_i$ present. The easiest example is the one-dimensional continuous
Hooke’s Law (e.g. vibrations in a rod). Our field here will be the amplitude
$\varphi(x, t)$ of the vibration at a point $x$ on the rod. The Lagrangian density is

\[ \mathcal{L}(x, t) = \frac{1}{2} \left[ \mu \left( \frac{\partial \varphi}{\partial t}(x, t) \right)^2 - \gamma \left( \frac{\partial \varphi}{\partial x}(x, t) \right)^2 \right], \]

where $\mu$ is a constant called the mass density and $\gamma$ is a constant playing the role here
of $k$. The first term is the kinetic energy density and the second (up to a sign) is the
strain, or potential energy, in the rod. The field equation (4.1.8) gives us
$\mu \frac{\partial^2 \varphi}{\partial t^2} - \gamma \frac{\partial^2 \varphi}{\partial x^2} = 0$.
This is easy to solve; physically it corresponds to a wave propagating with speed
$\sqrt{\gamma/\mu}$.
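
That travelling waves of this speed solve the field equation can be seen numerically. In the sketch below, which rests on our own illustrative choices of profile, sample point and constants, we insert $\varphi(x, t) = f(x - vt)$ into $\mu\, \partial_t^2 \varphi - \gamma\, \partial_x^2 \varphi$ and estimate the second derivatives by central differences.

```python
import math


def wave_residual(speed, mu, gamma, x=0.3, t=0.2, h=1e-4, profile=math.sin):
    """Finite-difference estimate of mu*phi_tt - gamma*phi_xx at (x, t)
    for the travelling wave phi(x, t) = profile(x - speed*t)."""
    def phi(x_, t_):
        return profile(x_ - speed * t_)

    phi_tt = (phi(x, t + h) - 2.0 * phi(x, t) + phi(x, t - h)) / h ** 2
    phi_xx = (phi(x + h, t) - 2.0 * phi(x, t) + phi(x - h, t)) / h ** 2
    return mu * phi_tt - gamma * phi_xx


if __name__ == "__main__":
    mu, gamma = 2.0, 18.0
    right_speed = math.sqrt(gamma / mu)
    print("residual at speed sqrt(gamma/mu):", wave_residual(right_speed, mu, gamma))
    print("residual at a different speed:   ", wave_residual(2.0, mu, gamma))
```

The residual should be negligible only when the speed equals $\sqrt{\gamma/\mu}$; for any other speed the profile fails to satisfy the field equation.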
Define the momentum $\pi(x, t) = \frac{\partial \mathcal{L}}{\partial(\partial_t \varphi)}$ conjugate to each field $\varphi(x, t)$. Then the field
equations (4.1.8) can be written as Poisson brackets involving Dirac deltas:

\[ \{\varphi(x, t), \pi(x', t)\}_P = \delta(x - x'), \tag{4.1.9a} \]
\[ \{\varphi(x, t), \varphi(x', t)\}_P = \{\pi(x, t), \pi(x', t)\}_P = 0. \tag{4.1.9b} \]


In special relativity, the Lagrangian density $\mathcal{L}$ transforms trivially (i.e. is a ‘scalar’)
under the Lorentz group, and the fields $\varphi^a$ span various representations $R$ of the Lorentz
group: that is, $\varphi^a(x) = \sum_b R(\Lambda)_{ab}\, \varphi'^b(x')$, where primes denote quantities in the reference
frame (or $\mathbb{R}^4$-basis) obtained from the unprimed one using Lorentz transformation
$\Lambda$.

An example important to physics (but not to us) is electromagnetism. The electromagnetic
field has components $F_{\mu\nu} := \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}$, where $A_4$ is the electric potential and
$\mathbf{A} = (A_1, A_2, A_3)$ is the magnetic potential. This field $F$ transforms in a six-dimensional
representation of the Lorentz group. The Lagrangian density is

\[ \mathcal{L} = -\frac{1}{4} \sum_{\mu,\nu,\alpha,\beta} F_{\mu\nu} F_{\alpha\beta}\, \eta^{-1}_{\mu\alpha} \eta^{-1}_{\nu\beta} + \sum_{\alpha,\beta} j_\alpha\, \eta^{-1}_{\alpha\beta} A_\beta =: -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} + j^\mu A_\mu, \]

where $j$ is the electric current 4-vector describing the distribution and motion of charged
particles. The matrix $\eta^{-1}$ arises here in its Riemannian role defining an inner-product.
The second expression is much more transparent, and uses $\eta^{\pm 1}$ to lower/raise indices,
and summing over repeated indices. Of course to the Lagrangian must be added the
(relativistic) kinetic energy of the particles or fields. The resulting field equations, called
Maxwell’s equations, tell us for instance how charged particles create an electromagnetic
field.

We saw in Section 4.1.1 that even the simplest classical systems can have singular
solutions, so the situation for classical field theory can only be worse. Most famous is the
self-energy of charged particles in electromagnetism, discussed beautifully in chapter 28,
vol. II of [188]: a charged particle localised to a point has infinite mass coming from the
electromagnetic field. To see this, imagine that we hold half an electron in our left hand
and the other half in our right; to make the electron whole we would have to connect these
two repulsive halves, and an easy calculation (namely the integral $-\int_{\infty}^{0} r^{-2} \, dr = \infty$) says
this requires infinite energy. This problem persists in its quantisation.
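
The divergence can be made concrete by stopping the integral at a small separation $\epsilon$ and watching the result grow like $1/\epsilon$. The sketch below drops all physical constants (charges, units) and uses a midpoint rule after the substitution $r = e^u$; the function name, the upper cutoff and the number of sample points are arbitrary assumptions.

```python
import math


def assembly_work(epsilon, r_max=1e6, n=100_000):
    """Midpoint-rule estimate of the integral of r**(-2) from epsilon to r_max,
    computed in the variable u = log(r), where r**(-2) dr = exp(-u) du."""
    u0, u1 = math.log(epsilon), math.log(r_max)
    du = (u1 - u0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        total += math.exp(-u) * du
    return total


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, assembly_work(eps))
```

Each tenfold decrease of $\epsilon$ should multiply the work by roughly ten, so no finite value survives the limit $\epsilon \to 0$.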

A remarkable classical field theory is Einstein’s general relativity, in which space- 
time is a pseudo-Riemannian manifold with metric tensor g(x), locally (but not glob- 
ally) equivalent to the Minkowski metric 7. Ignoring for convenience other forces, the 
Lagrangian density for a single particle is 


$$\mathcal{L}(x) = \frac{1}{2} m_0 \sum_{\mu,\nu} g_{\mu\nu}(x)\,\frac{dx^\mu}{d\tau}\frac{dx^\nu}{d\tau}\,\delta^4(x - x(\tau)) + \frac{c^4}{16\pi G}\sqrt{-\det g}\,R, \tag{4.1.10}$$
where G is Newton’s gravitational constant and R is a geometric quantity (a measure
of the radius of curvature of space-time at x). $\delta^4$ is the highly singular Dirac delta. The
numerical constant $c^4/16\pi G$, establishing the coupling strength between space-time and
matter, is chosen so that Einstein’s theory agrees with Newton’s in the appropriate limit.
Varying the particle’s coordinates $x^\mu$ yields the geodesic equation

$$\frac{d^2x^\mu}{d\tau^2} + \sum_{\nu,\kappa}\Gamma^\mu_{\nu\kappa}\,\frac{dx^\nu}{d\tau}\frac{dx^\kappa}{d\tau} = 0,$$
describing the straightest possible curves in the manifold ($\Gamma^\mu_{\nu\kappa}$ are the Christoffel
symbols). Varying the metric g yields Einstein’s field equations


$$R_{\mu\nu} - \frac{1}{2}R\,g_{\mu\nu} = \frac{8\pi G}{c^4}\,T_{\mu\nu}. \tag{4.1.11}$$

$R_{\mu\nu}$ are components of the Ricci tensor and $T_{\mu\nu}$ are those of the stress-energy tensor
defined below. The left side is geometrical, depending on first and second partial derivatives
of $g_{\mu\nu}$, while the right side is physical, depending on the matter fields. Einstein’s
field equations (4.1.11), which tell us how matter and energy curve space-time, consist
of 10 coupled nonlinear second-order partial differential equations for the components
$g_{\mu\nu}$.

The relation between symmetries and conserved quantities in field theory takes the
following form (generalised in Question 4.1.1). Suppose the Lagrangian density $\mathcal{L}$ is
invariant under a continuous symmetry $\alpha_s$. Associate with $\alpha_s$ the 4-vector

$$j^\mu(x) = \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\,\frac{\partial(\alpha_s(\phi))}{\partial s}, \tag{4.1.12a}$$
for $\mu = 1, 2, 3, 4$, called the ‘current’. Then j(x) is conserved, that is it obeys

$$\sum_{\mu=1}^{4}\frac{\partial j^\mu}{\partial x^\mu} = 0. \tag{4.1.12b}$$

This equation tells us to think of $j^4(x)$ as the density of some abstract fluid, and $\mathbf{j}(x) =
(j^1(x), j^2(x), j^3(x))$ as its velocity at each space-time point x. Equation (4.1.12b) tells
us that this ‘fluid’ is neither created nor destroyed, so that the total quantity (‘charge’)
$Q(t) = \int j^4(x)\,dx^1 dx^2 dx^3$ (if the integral exists) is constant: $dQ/dt = 0$.

For example, the invariance of the Lagrangian density $\mathcal{L}$ with respect to time and
space translations $x^\nu \mapsto x^\nu + a^\nu$ gives us the ‘current’ $T^{\mu\nu}(x)$ (one for each $\nu$) called
the stress-energy tensor. The ‘charges’ $Q^\nu$ here are the total momentum and energy. Or
consider the full Lagrangian density for the coupling of the electromagnetic field F to a
complex scalar field $\phi$ with mass m, charge e and potential V:

$$\mathcal{L} = -\frac{1}{4}\sum_{\mu,\nu} F_{\mu\nu}F^{\mu\nu} + \sum_{\mu}\left(\frac{\partial}{\partial x^\mu} - ieA_\mu\right)\phi^*\left(\frac{\partial}{\partial x_\mu} + ieA^\mu\right)\phi - m^2\phi^*\phi - V(\phi^*\phi). \tag{4.1.13}$$

The terms only involving $\phi$ and $\phi^*$ form the Lagrangian for the field $\phi$ alone, while
the terms involving both $\phi$ and A define the interaction. Note that there is a $U_1$ group
symmetry of $\mathcal{L}$, which acts trivially on F and A but acts on $\phi$ by $\alpha_s(\phi) = e^{ies}\phi$. Then
Q is indeed proportional to e. We return to this example next section.


Question 4.1.1. (a) Prove the following generalisation of Noether’s Theorem. Suppose
we have a continuous family $\alpha_s$ of diffeomorphisms of configuration space such that

$$L(\alpha_s(q), \dot{\alpha}_s(q)) = L(q, \dot{q}) + \frac{d}{dt}\Lambda(q, \dot{q}),$$
for some function $\Lambda$. First, verify that q(t) is a possible trajectory iff $\alpha_s(q(t))$ is. Next,


verify that the quantity 
OL (Ids 
o-=( oO) a 
oq Os 


is constant along any trajectory. 
(b) The Lagrangian for a free Newtonian particle is $L = \frac{1}{2}m\dot{x}^2$. Take $\alpha_s(x) = x + s\,a$ for
some constant vector $a \in \mathbb{R}^3$. Find $\Lambda$ here, and verify that the ‘charge’ Q is m x(0).


Question 4.1.2. Verify that the space $C^\infty(T^*M)$ of observables, with bracket given by
(4.1.6a), defines a Lie algebra, and that the first integrals form a Lie subalgebra. 


4.2 Quantum physics 


We tend to have a naive view of progress in science, namely that the old theory gets 
superseded by a new theory that is better in every meaningful respect: any phenomenon 
the older theory could explain, and any question the older theory could answer, the 
new theory would explain and answer at least as accurately; moreover, there would
be phenomena and questions that the older theory avoids but the newer, better theory
handles adroitly. In reality, progress in science (in contrast to progress in technology) has 
much in common with progress in popular music or in, say, America’s ability to elect 
great presidents. Copernicus’ circular orbits match observation worse than Ptolemy’s 
epicycles. More significantly, Copernicus required the Earth to move at incredible speeds, 
which mysteriously no experiment could ever detect (e.g. when we jump straight up, we 
come straight down). Ptolemy himself rejected the heliocentric hypothesis for these and 
several other good reasons. It was only after Galileo explained the role of inertia, after 
Copernicus’ time, that Copernicus’ unoriginal idea became scientifically reasonable. Of 
course to us today all motion is relative and the proceedings of that Great Debate belong in 
the voluminous Library-of-Dead-Religions. For another example, Aristotelian physics 
regarded friction as fundamental and the pendulum as complicated derived motion, 
whereas Newtonian physics regarded the pendulum as simple and friction as compound. 
In fact, classical physics never successfully explained friction — our present explanation 
requires quantum mechanics to correctly handle the relevant molecular forces (namely 
the van der Waals forces, which are residuals of the underlying electromagnetic forces). 
At least in part, ‘progress’ in science is a sociological phenomenon, a mantra bubbling 
on the lips of scientists as they pursue questions they are willing and able to address. 

In any case, the conceptually and mathematically elegant classical mechanics has 
been superseded by the fairly incoherent quantum physics. A century has passed since 
the birth of the quantum, and although almost all physicists today regard quantum theory 
as having successfully transcended classical physics, it is dangerous to conclude much 
from this. But one thing is certain: mathematics has been a great beneficiary of this 
‘transcendence’. 


4.2.1 Nonrelativistic quantum mechanics 


For fixed time t, the state of a single particle in quantum mechanics can be captured
by a complex-valued wave-function $x \mapsto \psi(x, t)$. Its interpretation is rather different
from ‘state’ in classical physics: the quantity $|\psi(x, t)|^2$ is the probability density that the
particle is at position x at time t. Probability arises here not because of uncertainty of
our knowledge, nor because of unavoidable disturbances caused by our heavy-handed
measuring processes. Rather, it is a fundamental ingredient of quantum reality. God’s
analysis too would stop at this probability.

Recall the discussion of Hilbert spaces in Section 1.3.1, in particular the rigged
Hilbert space $\mathcal{S}(\mathbb{R}^n) \subset L^2(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n)^*$, where the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ consists of
all smooth functions falling off with their derivatives to 0 quickly as $|x| \to \infty$ and where
the Hilbert space $L^2(\mathbb{R}^n)$ consists of the square-integrable functions with inner-product

$$\langle f, g\rangle = \int \overline{f(x)}\,g(x)\,d^n x.$$

For each time t, the span of the possible time-slices (states) $\psi(x, t)$ form the Schwartz
space $S = \mathcal{S}(\mathbb{R}^3)$, while their topological span forms the Hilbert space $H = L^2(\mathbb{R}^3)$.
We require the wave-function $\psi$ to be normalised: $\langle\psi, \psi\rangle(t) = 1$ for all t. Observables here
correspond to self-adjoint operators $A : S \to S$. For example, the operator associated
with measuring the ith coordinate of position takes $\psi \mapsto x_i\psi$, while energy is associated
with the operator $i\hbar\frac{\partial}{\partial t}$ (we can use (4.2.1) below to express it using spatial derivatives)
and the ith component of momentum with the operator $-i\hbar\frac{\partial}{\partial x_i}$.

The role of phase space is (loosely) played here by the projective space S/C, since the
physical states corresponding to nonzero multiples $c\psi$ are the same. This is significant
because it tells us that groups can act on S via projective representations, and still be
well-defined. This persists in all quantum theories and has many consequences. Not all
$\psi \in S$ though are actually physical states; for example, it appears that every physical
state must have a definite electric charge, that is be an eigenvector of some charge
operator, and of course most $\psi \in S$ aren’t.

There are two independent ways the wave-function evolves in time. The first way is
through Schrödinger’s equation, which is the linear partial differential equation

$$i\hbar\,\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi, \tag{4.2.1}$$

where V is the potential energy (which acts multiplicatively on $\psi$), $\hbar$ is Planck’s constant
and $\nabla^2$ is the Laplacian $\frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2}$. Schrödinger’s equation governs the deterministic,
unitary evolution of $\psi$ between measurements. It is standard to choose
units so that Planck’s constant $\hbar$ equals 1 (recall the discussion in Section 4.1.1); however,
in units natural to our familiar macroscopic world (e.g. metres, kilograms and seconds)
its magnitude (about $10^{-34}$) emphasises just how invisible quantum effects are to us.
Schrödinger’s equation can be formally integrated, and we obtain

$$\psi(x, t) = U(t)\,\psi(x, 0), \tag{4.2.2}$$

where $U(t) = \exp[-iHt/\hbar]$ is a unitary operator on S (hence H) for the Hamiltonian
operator H given by the right side of (4.2.1). Conversely, we could have anticipated
(4.2.1) by the following reasoning. The time evolution (4.2.2) should be given by a linear
operator U(t) independent of $\psi$ (so U(s) U(t) = U(s + t)), which preserves the
normalisation: $\|U(t)\psi(x, 0)\| = \|\psi(x, t)\| = 1$. This implies that $U(t) = \exp[iH't]$, that is
$d\psi/dt = iH'\psi$, for some self-adjoint operator $H'$. For physical reasons we would expect
$H'$ to have something to do with energy, that is the classical Hamiltonian H, since energy
is the conjugate observable to time just as momentum is to position. Indeed, Schrödinger’s
equation (4.2.1) comes from the nonrelativistic formula for energy ($E = \frac{1}{2m}p^2 + V$),
together with the quantum mechanical substitutions $E \mapsto i\hbar\frac{\partial}{\partial t}$ and $p \mapsto -i\hbar\nabla$.
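The reasoning around (4.2.2) can be tried out numerically on a toy model. The sketch below is only an illustration under loud assumptions: it replaces the infinite-dimensional space S by a two-dimensional one, and the matrix `H`, the vector `psi0` and the helper names are hypothetical choices, not taken from the text. It builds U(t) = exp[-iHt/ħ] from its Taylor series and evaluates the norm of the evolved state.

```python
import math

def mat_mul(a, b):
    """Multiply two square complex matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def evolution_operator(hamiltonian, t, hbar=1.0, terms=60):
    """Approximate U(t) = exp(-i H t / hbar) by a truncated Taylor series."""
    n = len(hamiltonian)
    step = [[-1j * t / hbar * hamiltonian[i][j] for j in range(n)] for i in range(n)]
    result = [[(1.0 + 0j) if i == j else 0j for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds step^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def apply(matrix, vector):
    """Apply a matrix to a column vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def norm(vector):
    return math.sqrt(sum(abs(c) ** 2 for c in vector))

# Hypothetical self-adjoint (Hermitian) Hamiltonian on a two-dimensional state space.
H = [[1.0, 0.5 - 0.2j],
     [0.5 + 0.2j, -0.3]]
psi0 = [0.6, 0.8j]                      # normalised: 0.36 + 0.64 = 1

psi_t = apply(evolution_operator(H, 1.7), psi0)
print(norm(psi_t))                      # stays at 1 up to rounding
```

Because `H` is self-adjoint, the norm of `psi_t` stays at 1 up to rounding, and multiplying `evolution_operator(H, s)` by `evolution_operator(H, t)` reproduces `evolution_operator(H, s + t)`.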

The second type of wave-function evolution is indeterministic and discontinuous,
and occurs at the instant $t_0$ when a measurement is made. Let A be the self-adjoint
operator corresponding to the observable being measured. Assume for simplicity that
its spectrum (i.e. its set of eigenvalues) is discrete and nondegenerate. Then there is an
orthonormal set $\{\psi_a(x)\} \subset S$ of eigenvectors spanning H (topologically). So $A\psi_a = a\,\psi_a$
and $\langle\psi_a, \psi_b\rangle = \delta_{ab}$. If $\psi$ is the wave-function of the particle being observed, write
$\psi(x, t_0) = \sum_a c_a\psi_a(x)$. The result of the observation will be one of the eigenvalues a,
$a_0$ say, but which one cannot be predicted in advance. All that can be said is that $|c_a|^2$ is the
probability that a will be the one observed. Nothing was responsible for the given eigenvalue
$a_0$ arising: two completely identical quantum systems can (and usually will) yield
different observed values. At time $t_0$ the wave-function $\psi$ suffers a spontaneous and
discontinuous change $\psi \mapsto \psi_{a_0}$ (or more generally the orthogonal projection of $\psi$ into the
$a_0$-eigenspace). For times immediately after $t_0$, the wave-function then proceeds to evolve
by (4.2.1). This second type of evolution is necessary for the experimental consistency
of the theory: experimental results can be reproduced! It is a truly physical evolution,
and not merely book-keeping reflecting a change in our knowledge of the system.
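The measurement rule just described can be mimicked on a computer for a discrete, nondegenerate spectrum. This is a minimal sketch, assuming a hypothetical two-outcome observable; the function `measure`, the coefficient list and the eigenvalues are illustrative names and values, not part of the text. An outcome is drawn with probability $|c_a|^2$, and the state is replaced by the corresponding eigenvector.

```python
import random

def measure(coefficients, eigenvalues, rng):
    """Draw one eigenvalue with probability |c_a|^2; return it, the collapsed state,
    and the list of outcome probabilities."""
    probabilities = [abs(c) ** 2 for c in coefficients]
    total = sum(probabilities)
    probabilities = [p / total for p in probabilities]   # guards against rounding drift
    index = rng.choices(range(len(eigenvalues)), weights=probabilities)[0]
    # After the measurement the state is the eigenvector belonging to the observed value.
    collapsed = [1.0 if i == index else 0.0 for i in range(len(coefficients))]
    return eigenvalues[index], collapsed, probabilities

rng = random.Random(0)                   # fixed seed so the run is repeatable
coefficients = [0.6, 0.8j]               # hypothetical expansion coefficients c_a
eigenvalues = [-1.0, 1.0]                # hypothetical eigenvalues a

first, state_after, probs = measure(coefficients, eigenvalues, rng)
second, _, _ = measure(state_after, eigenvalues, rng)
print(probs, first, second)
```

Measuring the collapsed state a second time returns the same eigenvalue, which is the reproducibility of experimental results that the paragraph above insists on.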

For example, the simultaneous eigenvalues $p = (p_1, p_2, p_3) \in \mathbb{R}^3$ of the three momentum
operators correspond to eigenfunction $\psi(x, t) = e^{ip\cdot x/\hbar}$ while the simultaneous
eigenvalues $a = (a_1, a_2, a_3) \in \mathbb{R}^3$ of the position operators have eigenfunctions given
by the three-dimensional Dirac delta $\delta^3(x - a)$. These spectra aren’t discrete and (generalised)
eigenfunctions aren’t square-integrable (rather they are tempered distributions; see
Section 1.3.1), because exact position and momentum observations in quantum theory are
nonphysical idealisations (e.g. probing infinitesimal distances requires infinite energy).
Moreover, since the position and momentum operators don’t share any eigenvectors, it
is meaningless to speak simultaneously of the (numerical) position and momentum of a
particle: in quantum mechanics a particle cannot have a well-defined trajectory.

This framework generalises in the obvious ways. For n particles, the wave-function $\psi$
looks like $\psi(x_1, \ldots, x_n, t)$ and on the right side of (4.2.1) the Laplacian $\nabla^2$ gets replaced
by the sum of n Laplacians $\nabla_i^2$, one for each $x_i$.

This treatment of many particles indicates a weak point of quantum mechanics. Exper- 
iment tells us that the number of elementary particles can change, for example, a muon 
can decay into an electron and two neutrinos. It is rather difficult to believe that the 
fundamental equation of motion in physics changes discontinuously with time, but that 
is how quantum mechanics would model the decay of, for example, the muon: at some
time $t_0$ the wave-function would acquire six more variables and Schrödinger’s equation
six more terms. The way out (Section 4.2.2) simultaneously handles all numbers of
particles. 

The fascinating measurement problem of quantum physics, present in any quantum 
theory, is the struggle to understand this dichotomy of wave-function evolutions. What is 
so special about measurement, that it should obey special laws? After all, surely a mea- 
surement is merely a certain kind of physical process. Many remarkable elaborations 
have been proposed by respected physicists, for example, that the universe splits into 
different ‘parallel universes’ after each measurement, or that a measurement involves the 
imposition of mind on matter. Precisely what constitutes a measurement? Any quantum 
measurement involves the amplification of a microscopic quantum property or effect to 
a macroscopic one. What does quantum physics tell us about the macroscopic (classical) 
world? The linearity of Schrödinger’s equation implies that linear combinations
(‘superpositions’) of solutions will again be solutions. Now, microscopic superpositions are
well-observed and fundamental to the theory; during a quantum measurement (if not at
other times) macroscopic superpositions should be unavoidable. However, what would
a macroscopic superposition look like? Why have we never observed the superposition
of, for example, a live and dead cat? We are led to the suspicion that quantum physics is
incompatible with our most elementary qualitative observations of (macroscopic) physical
reality.

Fig. 4.4 The golf ball experiment. Panels (a), (b), (c).

To make this more precise, consider the situation depicted in Figure 4.4, where a 
machine randomly putts golf balls towards two barriers, one behind the other. When 
a hole is cut into the first barrier, as in Figure 4.4(a), the balls that reach the second 
barrier (i.e. pass through the hole) will impact it at roughly the same spot — the tra- 
jectories of golf balls over short distances are approximately linear. And if we cut two 
holes into the first barrier, we will get the result depicted in Figure 4.4(b). (We ignore 
all balls that get stopped by the first barrier.) Now suppose that whenever we avert our 
eyes for a few minutes, the golf balls make instead the impact pattern of Figure 4.4(c). 
That unbelievable phenomenon would suggest that changing the nature of our obser- 
vation can dramatically affect golf ball trajectories. Classically, there is no evidence of 
this. 

Of course that is precisely what occurs in the remarkable two-slit experiment, where
electrons are fired at a screen. The electron wave-function $\psi$ is the normalised
superposition $\frac{1}{\sqrt{2}}(\psi_a + \psi_b)$ of wave-functions corresponding to travel through the a-slit or
the b-slit. Individually, the wave-functions $\psi_a(x, t)$ and $\psi_b(x, t)$ both give rise to the
probability density (for the arrival spot on the screen behind the two slits) we would
expect from the golf balls of Figure 4.4(a). However, their superposition $\psi$ gives rise
to probabilities $\frac{1}{2}|\psi_a + \psi_b|^2 \neq \frac{1}{2}|\psi_a|^2 + \frac{1}{2}|\psi_b|^2$: the two possible paths of the electron
interfere with each other, much as they would if an electron were, for example, a water
ripple. If we were to try to detect which slit the electron goes through, say by setting
up a detector at each slit (as in Figure 4.4(b)), this additional measurement would first
‘collapse’ $\psi$ into either $\psi_a$ or $\psi_b$ (with equal probabilities). The resulting probability
density for the arrival spot would be the particle-like $|\psi_a|^2$ or $|\psi_b|^2$, respectively (or
$\frac{1}{2}|\psi_a|^2 + \frac{1}{2}|\psi_b|^2$ if we don’t keep track of which slit the electron passed through).⁵
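The difference between the two densities is easy to see numerically. The following sketch is an illustration only, under stated assumptions: each slit is modelled as the source of a unit-modulus spherical-wave phase factor (so the amplitudes are not normalised over the screen), and the geometry parameters `separation`, `distance` and `wavelength` are hypothetical values chosen to give visible fringes.

```python
import cmath
import math

def slit_amplitude(screen_x, slit_x, distance, wavelength):
    """Unit-modulus amplitude at screen position screen_x for a wave leaving one slit."""
    path = math.hypot(distance, screen_x - slit_x)
    return cmath.exp(2j * math.pi * path / wavelength)

def densities(screen_x, separation=1.0, distance=100.0, wavelength=0.1):
    """Return (both slits unobserved, slits monitored) densities at screen_x."""
    psi_a = slit_amplitude(screen_x, +separation / 2, distance, wavelength)
    psi_b = slit_amplitude(screen_x, -separation / 2, distance, wavelength)
    quantum = 0.5 * abs(psi_a + psi_b) ** 2                     # (1/2)|psi_a + psi_b|^2
    classical = 0.5 * abs(psi_a) ** 2 + 0.5 * abs(psi_b) ** 2   # no interference term
    return quantum, classical

centre = densities(0.0)
darkest = min(densities(i * 0.01)[0] for i in range(1001))      # scan 0 <= x <= 10
print(centre, darkest)
```

In this toy geometry the monitored density is flat, while the unobserved density oscillates between nearly zero and twice the monitored value: the interference fringes of panel (c).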

So why can’t macroscopic states interfere? The special feature (‘decoherence’) of
a macroscopic system seems to be that it is under unavoidable continuous interaction
with the environment, through gravity if nothing else. Macroscopically distinct states
(e.g. different pointer positions on an instrument, or golf balls rolling through different
holes) couple differently to the environment, and so the macroscopic system becomes
thoroughly and irreversibly entangled with the environment. This entanglement is
essentially irreversible because any interaction that succeeded in untangling the coupling of the
state with the environment would require enormous numbers ($10^{27}$ or so) of degrees of
freedom to conspire appropriately. This has the effect of making the macroscopic states
essentially ‘decohere’ from each other, that is, the interference terms $\frac{1}{2}\overline{\psi_A}\psi_B + \frac{1}{2}\overline{\psi_B}\psi_A$,
when expanded into the disordered microscopic degrees of freedom, get averaged away
to zero. To get the flavour of decoherence, consider the wave-functions $\psi_A, \psi_B$ describing
classical objects A, B. They are actually functions of $10^{27}$ or so space variables $x_{ij}$, but
because they are macroscopic we would expect them effectively to be functions of our
familiar three-dimensional space. Moreover, they would be essentially localised in this
space, so $|\psi_A(x, t) + \psi_B(x, t)|^2 = |\psi_A(x, t)|^2 + |\psi_B(x, t)|^2$, provided A and B are situated
a macroscopic distance apart (i.e. provided the supports of the effective functions
$\psi_A$ and $\psi_B$ are disjoint). This is decoherence.
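The last step, that disjoint supports kill the interference term, can be checked directly. This is a crude sketch under assumptions: the ‘effective wave-functions’ are modelled by hypothetical box functions on a one-dimensional grid, and `box`, `grid` and `cross_term` are illustrative names only.

```python
def box(x, left, right, height=1.0):
    """A crude localised profile: constant on [left, right], zero elsewhere."""
    return height if left <= x <= right else 0.0

STEP = 0.01
grid = [i * STEP for i in range(-200, 1201)]      # positions from -2 to 12

def cross_term(psi_a, psi_b):
    """Riemann-sum approximation of the integral of 2 Re(conj(psi_a) psi_b)."""
    return sum(2 * (complex(psi_a(x)).conjugate() * psi_b(x)).real for x in grid) * STEP

# Supports [0, 1] and [9, 10] are disjoint: the interference term vanishes identically.
far_apart = cross_term(lambda x: box(x, 0, 1), lambda x: 1j * box(x, 9, 10))
# Supports [0, 1] and [0.5, 1.5] overlap: the interference term survives.
overlapping = cross_term(lambda x: box(x, 0, 1), lambda x: box(x, 0.5, 1.5))
print(far_apart, overlapping)
```

With disjoint supports the cross term is exactly zero, so the density of the sum equals the sum of the densities; once the supports overlap the cross term is of order one.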

Of course alone this doesn’t resolve the measurement problem. At best decoherence
can only explain why macroscopically distinct states in superpositions don’t ‘see’
each other. A (perhaps overly zealous) application of quantum mechanics insists that
macroscopic superpositions must occur; from this, the ‘Many-Worlds’ interpretation is
inevitable. The explanation for the mysterious wave-function collapse then would be that
measurement entangles the quantum system $\psi^1 = \sum_i c_i\psi^1_i$ with a macroscopic system
$\psi^2$; that is, via Schrödinger’s equation, the decoupled wave-function $\psi^1\psi^2$ relevant just
prior to measurement would be replaced with the coupled wave-function $\sum_i c_i\psi^1_i\psi^2_i$ just
after. Each coupled state (‘world’) $\psi^1_i\psi^2_i$ in this superposition would decohere from the
others, and so the various quantum states $\psi^1_i$ could no longer ‘see’ each other. It would
be as if at the moment of measurement, the universe split into parallel universes, one
for each possible experimental outcome. The ‘Many-Worlds’ interpretation is quantum
mechanics in its purest form; in this framework measurement is a physical process subject
only to Schrödinger’s equation, and neither wave-function collapse nor the splitting
of universes actually occurs. The price of this demystification of measurement is a reality
in which almost everything is hidden from us, including infinitely many near-copies
of ourselves.⁶ A derivation of sorts of the probability rule is also possible within this
framework.


5 We shouldn’t over-emphasise this ‘wave—particle duality’. ‘Waves’ and ‘particles’ are classical metaphors; 
an electron is neither. Even the name ‘wave-function’ for y is an anachronism going back to de Broglie’s 
hypothesis that an electron behaves like a wave with wavelength h/p. 

6 In defence of this uncomfortable aspect of Many-Worlds, Nature — unlike us — clearly loves enormous 
numbers of nearly identical copies. Consider blades of grass in a field, or water molecules in a lake (or 




We’ve only sketched one possible interpretation. There are many others. For instance,
the presence of probability in quantum mechanics strongly suggests that we are ignoring
certain degrees of freedom; after all, this is what probability signifies in classical
mechanics. It is possible to formulate quantum mechanics as a deterministic classical
theory, by introducing ‘hidden variables’. In the case of one particle, these hidden
degrees of freedom would be the position coordinates x(t) of the particle. The coordinates
x(t) obey a differential equation involving the wave-function $\psi$, which in turn
obeys Schrödinger’s equation. A similar formulation can be made for any number of
particles. However, ‘Bell’s Theorem’ says that any multi-particle hidden variables theory
must possess the notorious feature called ‘nonlocality’. This means that an influence
(e.g. measurement) done on one particle can instantaneously affect the state of a distant
particle. Nonlocality in a theory warns of possible difficulties in making the theory
relativistic.

The approaches to the quantum measurement problem illustrate the desperate imagi- 
nation that squirts from our pores when we’re backed into a corner. See the book [556] 
for more details, examples and references to the literature. Like any other metaphysical 
doctrine, an interpretation is chosen not for its approximation to Truth, but because we 
find intriguing (and publishable!) the avenues of study it suggests. 

For a one-dimensional example of a quantum system, consider once again the harmonic
oscillator. The potential is $V = \frac{k}{2}x^2$, so Schrödinger’s equation here reads

$$i\hbar\,\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + \frac{k}{2}x^2\,\psi. \tag{4.2.3a}$$
Because the potential V is independent of time, this is separable into energy eigenstates:
write $\psi(x, t) = e^{-iEt/\hbar}\psi_E(x)$, where

$$\frac{\hbar^2}{2m}\frac{d^2\psi_E}{dx^2} + \left(E - \frac{k}{2}x^2\right)\psi_E = 0. \tag{4.2.3b}$$

In order for $\psi$ to be normalisable, we require the boundary conditions $\psi(x, t) \to 0$
as $|x| \to \infty$; this implies (with a little work) that $E = (n + \frac{1}{2})\hbar\sqrt{k/m}$ for $n \in \mathbb{N}$, that is
energy is quantised and bounded from below.
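As a quick numerical check of the lowest level, the sketch below tests that a Gaussian solves (4.2.3b) with $E = \frac{1}{2}\hbar\sqrt{k/m}$. The explicit ground-state profile $\exp(-\sqrt{km}\,x^2/2\hbar)$ is a standard fact that the text does not spell out, and the function names and the sample values of k, m and ħ are assumptions made for illustration.

```python
import math

def energy(n, k=1.0, m=1.0, hbar=1.0):
    """Energy level E_n = (n + 1/2) * hbar * sqrt(k/m)."""
    return (n + 0.5) * hbar * math.sqrt(k / m)

def ground_state(x, k=1.0, m=1.0, hbar=1.0):
    """Unnormalised Gaussian exp(-sqrt(k m) x^2 / (2 hbar))."""
    return math.exp(-math.sqrt(k * m) * x * x / (2 * hbar))

def residual(x, level_energy, k=1.0, m=1.0, hbar=1.0, h=1e-4):
    """Left side of (4.2.3b) for the Gaussian, second derivative by central differences."""
    psi = lambda y: ground_state(y, k, m, hbar)
    second = (psi(x + h) - 2 * psi(x) + psi(x - h)) / (h * h)
    return hbar ** 2 / (2 * m) * second + (level_energy - 0.5 * k * x * x) * psi(x)

k, m, hbar = 2.0, 3.0, 0.5               # arbitrary sample values
for x in (-1.0, 0.0, 0.3, 2.0):
    print(x, residual(x, energy(0, k, m, hbar), k, m, hbar))
```

The residual vanishes to within discretisation error for n = 0, whereas inserting a different energy such as `energy(1, k, m, hbar)` leaves a residual of order one.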
perhaps research publications?). Or more to the point, consider the uncountably many moments making up
each life.

A useful idealisation is the step-function potential

$$V(x) = \begin{cases} 0 & \text{if } x < 0 \\ V_0 & \text{otherwise} \end{cases}$$
where $V_0$ is constant. Solving the corresponding one-dimensional Schrödinger’s equation with
the requirement that both $\psi$ and its derivative $\frac{\partial\psi}{\partial x}$ be continuous at x = 0, we obtain

$$\psi(x, t) = e^{-iEt/\hbar} \times \begin{cases} A\exp(ip_+x/\hbar) & \text{for } x > 0 \\ \exp(ip_-x/\hbar) + B\exp(-ip_-x/\hbar) & \text{for } x < 0 \end{cases}$$
where $p_+ = \sqrt{2m(E - V_0)}$ and $p_- = \sqrt{2mE}$ are the classical momenta (at least for
$E > V_0$), and $A = \frac{2p_-}{p_+ + p_-}$ and $B = \frac{p_- - p_+}{p_- + p_+}$. Physically, this describes a wave (energy
eigenstate) travelling to the right from $x = -\infty$, with energy E > 0; it hits the wall at
x = 0, part of it continuing to positive x and some of it reflecting back to negative x. If
we were to measure whether or not reflection happened, we would find that reflection
happened with probability $|B|^2 = 1 - \frac{p_+}{p_-}|A|^2$ (the factor $p_+/p_-$ accounts for the different
speeds on the two sides of the step). Note that we get some very nonclassical
behaviour: classically, when $E > V_0$ the whole wave would be transmitted to positive
x, but here some of the wave is reflected, even when $V_0 < 0$! It is as if we are about
to tumble over Niagara Falls in a barrel, only to bounce back the instant we reach the
precipice. Related to this is quantum tunnelling (Question 4.2.2).
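The matching conditions at x = 0 can be solved and checked in a few lines. This is a sketch under assumptions: the function name and the sample values of E and $V_0$ are hypothetical, and the transmitted probability is weighted by the flux ratio $p_+/p_-$, a standard fact about probability currents rather than something stated in the text.

```python
import cmath

def step_coefficients(energy, v0, m=1.0):
    """Amplitudes A (transmitted) and B (reflected) fixed by continuity of psi and psi' at 0."""
    p_minus = cmath.sqrt(2 * m * energy)          # momentum on the side x < 0, where V = 0
    p_plus = cmath.sqrt(2 * m * (energy - v0))    # momentum on the side x > 0, where V = V0
    a = 2 * p_minus / (p_minus + p_plus)
    b = (p_minus - p_plus) / (p_minus + p_plus)
    return a, b, p_minus, p_plus

# A downward step ("waterfall"): E = 2, V0 = -6, so p_minus = 2 and p_plus = 4.
a, b, p_minus, p_plus = step_coefficients(2.0, -6.0)
reflection = abs(b) ** 2
transmission = (p_plus / p_minus).real * abs(a) ** 2   # flux-weighted, valid for E > V0
print(reflection, transmission, reflection + transmission)
```

Even though the particle has more than enough energy to go over the edge, one ninth of the probability is reflected in this example, and reflection plus transmission add up to one.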

Quantum mechanics was born around 1926 when Schrödinger obtained (4.2.1) and,
simultaneously, when Heisenberg and others developed an equivalent formulation.
Unlike Schrödinger’s picture, in Heisenberg’s the state $\Psi$ of the system is regarded as
constant in time, and the time-evolution is carried by the observables A. It is completely
analogous to the two attitudes towards observables carried in classical mechanics: we
can view an observable A(q, p) as a time-independent $C^\infty$-function on phase space, or
we can regard it as a function A(q(t), p(t)) of time. The equivalence between these two
pictures of quantum mechanics is straightforward: the Heisenberg state $\Psi \in S$ can be
taken to be the wave-function $\psi(x, 0)$ at time t = 0, while the Heisenberg operator A(t)
corresponds to Schrödinger’s operator A via the relation $A(t) = U(t)^{-1}AU(t)$, where
$U(t) = \exp[-iHt/\hbar]$ as before. Differentiating, we find that the equation of motion in
Heisenberg’s picture is given by commutation with H:

$$i\hbar\,\frac{dA(t)}{dt} = [A(t), H]. \tag{4.2.4}$$

In relativistic quantum theory, Heisenberg’s picture is more convenient because time
doesn’t play as privileged a role. In particular, just as U(t) describes translations in time,
a unitary operator V(x) describes translations in space, and so we can regard the state $\Psi$ as
independent also of space. More generally, we have a unitary (projective) representation
$(a, \Lambda) \mapsto U_{(a,\Lambda)}$ of the Poincaré group, acting on the infinite-dimensional space of states.
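The Heisenberg equation of motion (4.2.4) can be verified numerically in a finite-dimensional toy model. The sketch below assumes a hypothetical two-level system with a diagonal Hamiltonian, so that $U(t) = \exp[-iHt/\hbar]$ is known in closed form; the names `ENERGIES`, `A0` and the helper functions are illustrative only.

```python
import cmath

HBAR = 1.0
ENERGIES = [0.3, 1.1]                     # hypothetical eigenvalues of a diagonal H
A0 = [[2.0, 1.0 - 1.0j],
      [1.0 + 1.0j, -1.0]]                 # a self-adjoint Schrodinger-picture observable

def heisenberg(t):
    """A(t) = U(t)^{-1} A U(t), with U(t) = exp(-i H t / hbar) diagonal."""
    u = [cmath.exp(-1j * e * t / HBAR) for e in ENERGIES]
    return [[A0[j][k] * u[k] / u[j] for k in range(2)] for j in range(2)]

def commutator_with_h(a):
    """[A, H] = A H - H A for diagonal H: entry (j, k) is A_jk (E_k - E_j)."""
    return [[a[j][k] * (ENERGIES[k] - ENERGIES[j]) for k in range(2)] for j in range(2)]

def lhs(t, h=1e-6):
    """i hbar dA/dt, with the derivative taken by a central finite difference."""
    plus, minus = heisenberg(t + h), heisenberg(t - h)
    return [[1j * HBAR * (plus[j][k] - minus[j][k]) / (2 * h) for k in range(2)]
            for j in range(2)]

t = 0.7
left, right = lhs(t), commutator_with_h(heisenberg(t))
print(max(abs(left[j][k] - right[j][k]) for j in range(2) for k in range(2)))
```

The printed discrepancy is at the level of the finite-difference error, so both sides of the equation of motion agree entry by entry.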

Equation (4.2.4) should look familiar: it is formally identical to the classical evolution
(4.1.6b) of observables, provided we replace the Poisson bracket of classical observables
there with the commutator of the quantum observables (up to the factor $i\hbar$). Other
examples of this are the calculations $\{x, p\}_P = 1$ and $[x, p] = i\hbar$. In other words,
the process (‘quantisation’) of going from classical mechanics to the corresponding
quantum mechanics defines a representation of the Lie algebra $C^\infty(T^*M)$ (with Poisson
bracket) into the Hilbert space H. However, this quantisation is clouded somewhat by the
observation that the classical space $C^\infty(T^*M)$ is also an associative commutative algebra
using pointwise product (fg)(y) = f(y) g(y) of the functions, and that this product is
also important as it is how we can build up general observables from the elementary
ones $x_i$, $p_i$. Unfortunately, there is no direct analogue of this second product for the
space of self-adjoint operators on H (or S). The closest would be the operation $A * B =
\frac{1}{2}(AB + BA)$, which makes the space of quantum operators into a (non-associative)
Jordan algebra, originally named after the quantum physicist Pascual Jordan but now
part of standard algebraic repertoire.

An alternate, rather intriguing approach to quantisation seeks to formulate quantum
mechanics in terms of a one-parameter deformation of the pointwise product algebra
$\mathcal{A} = C^\infty(T^*M)$ (see [141] for a review). In particular, let $\mathcal{A}[[\lambda]]$ denote the space of
all formal power series in $\lambda$ with coefficients in $\mathcal{A}$. We add these power series term
by term in the obvious way, but the product in $\mathcal{A}[[\lambda]]$ is more complicated (though
necessarily associative). Expand out the product: $f \star g = \sum_{k \geq 0} C_k(f, g)\,\lambda^k$, where for
each f, g, $C_k(f, g) \in \mathcal{A}$. Because it is a deformation we require $C_0(f, g)$ to equal the
usual pointwise product fg. In order to relate this to quantum mechanics, we also
require that the coefficient $C_1(f, g) - C_1(g, f)$ of the leading term in the commutator
$f \star g - g \star f$ be the Poisson bracket $2\{f, g\}_P$. We think of the deformation parameter
$\lambda$ as equalling $i\hbar/2$. The main appeal of this approach to quantum mechanics is that
classical and quantum mechanics are placed on the same page, so rigorous sense can be
made of the statement that we recover classical physics from the $\lambda \to 0$ limit. However,
it can be criticised for making classical mechanics logically prior to quantum mechanics,
when the reverse would seem more natural. Also there are some quantum mechanical
systems that don’t seem to have a classical analogue. Kontsevich was awarded his Fields
medal in 1998 in part for his proof that such a deformation exists not only for any phase
space X = T*M (this was known before), but more generally for any differentiable
manifold X on which can be defined a Poisson bracket (a Lie algebra structure for
$C^\infty(X)$).
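As a small worked illustration of these requirements (a sketch under stated assumptions, not a construction taken from the text), take M to be the real line, with coordinates (x, p) on T*M and $\{x, p\}_P = 1$. Assume the first-order coefficient is chosen to be $C_1(f, g) = \{f, g\}_P$, which is one choice satisfying $C_1(f, g) - C_1(g, f) = 2\{f, g\}_P$; the higher coefficients $C_k$, $k \geq 2$, are left unspecified.

```latex
\begin{align*}
x \star p &= xp + \lambda\,\{x, p\}_P + O(\lambda^2) = xp + \lambda + O(\lambda^2),\\
p \star x &= px + \lambda\,\{p, x\}_P + O(\lambda^2) = xp - \lambda + O(\lambda^2),\\
x \star p - p \star x &= 2\lambda + O(\lambda^2) = i\hbar + O(\hbar^2)
  \qquad\text{for } \lambda = i\hbar/2 .
\end{align*}
```

To leading order the deformed commutator of x and p therefore reproduces the quantum relation $[x, p] = i\hbar$, while the pointwise product xp is recovered as $\lambda \to 0$.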

Consider the harmonic oscillator in Heisenberg’s picture. The possible states span a
space S, dense in a Hilbert space H. Define the operators

$$a = \frac{(km)^{1/4}}{\sqrt{2\hbar}}\left(x + \frac{i}{\sqrt{km}}\,p\right), \qquad a^\dagger = \frac{(km)^{1/4}}{\sqrt{2\hbar}}\left(x - \frac{i}{\sqrt{km}}\,p\right), \tag{4.2.5}$$

acting on S. These are called annihilation and creation operators, respectively. Note that
$[a, a^\dagger] = I$, the identity operator. Hence $I, a, a^\dagger$ define a representation of $\mathfrak{Heis}$ (1.4.3)
on the infinite-dimensional space S. Let’s find a more explicit realisation of this
representation. This requires identifying the vacuum state $|0\rangle \in S$, that is an eigenvector of the
Hamiltonian H with minimal eigenvalue (i.e. a state with lowest energy), normalised so
that $\||0\rangle\| = 1$. Physically, the vacuum denotes the ground state, containing no particles.
The energy operator, that is the Hamiltonian, becomes $H = \frac{1}{2m}p^2 + \frac{k}{2}x^2 = \hbar\sqrt{k/m}\,(a^\dagger a + \frac{1}{2})$
(as usual it is time-independent). The vacuum obeys $a|0\rangle = 0$ (why?) and has energy
$E_0 = \frac{\hbar}{2}\sqrt{k/m}$ (i.e. that is its H-eigenvalue). Assume that the vacuum is nondegenerate,
that is the eigenspace associated with energy $E_0$ has dimension 1; a degenerate
vacuum would correspond to a number of non-interacting equivalent oscillators working
in parallel. This assumption implies that the vacuum vector will be unique up to
a phase $e^{i\alpha}|0\rangle$ (choose one), and that the vacuum state is well-defined. Define vectors
$|n\rangle := (n!)^{-1/2}(a^\dagger)^n|0\rangle$. This curious notation is due to Dirac: the functional $\langle\star| \in S^*$
is called a bra, the vector $|\star\rangle \in S$ a ket, and the evaluation $\langle\star|\star\rangle \in \mathbb{C}$ a bra(c)ket.
This bracket also captures inner-products, using the adjoint $|\star\rangle^\dagger = \langle\star|$. Note that $|n\rangle$
has norm 1, and it is an eigenvector of H with eigenvalue $E_n := (2n + 1)E_0$. Construct
the operator $N = a^\dagger a$, then $N|n\rangle = n|n\rangle$. We are to think of N as a number operator, as
it counts the number of quanta (or excitations or quantum particles) in the given state.
We say that the operator $a^\dagger$ creates a quantum, and $a$ annihilates a quantum. The vectors
$|n\rangle$ (n = 0, 1, 2, ...) form an orthonormal set; the state space S here consists of all
$\sum_{n=0}^{\infty} c_n|n\rangle$ with $\sum_n n^m|c_n| < \infty$ for all m, while the Hilbert space H here consists of
all $\sum_n c_n|n\rangle$ with $\sum_n |c_n|^2 < \infty$. In this algebraic way we can recover all of the physics.

Fig. 4.5 A collision of two identical particles.
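The algebra of $a$, $a^\dagger$ and N can be explored with finite matrices. The sketch below is only an approximation under an explicit assumption: the infinite orthonormal set $|0\rangle, |1\rangle, \ldots$ is truncated to the first `size` vectors, which necessarily spoils $[a, a^\dagger] = I$ in the last diagonal entry. The function names are hypothetical; the matrix entries use the standard relations $a|n\rangle = \sqrt{n}\,|n-1\rangle$ and $a^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle$, which follow from the definition of $|n\rangle$.

```python
import math

def ladder_matrices(size):
    """Matrices of a and a-dagger in the truncated basis |0>, ..., |size-1>."""
    lower = [[0.0] * size for _ in range(size)]
    for n in range(1, size):
        lower[n - 1][n] = math.sqrt(n)            # a|n> = sqrt(n) |n-1>
    # a-dagger is the transpose, since all entries are real
    upper = [[lower[j][i] for j in range(size)] for i in range(size)]
    return lower, upper

def mat_mul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

SIZE = 5
a, a_dag = ladder_matrices(SIZE)
number = mat_mul(a_dag, a)                         # N = a-dagger a
commutator = [[p - q for p, q in zip(row1, row2)]
              for row1, row2 in zip(mat_mul(a, a_dag), number)]

E0 = 0.5                                           # E_0 in units where hbar*sqrt(k/m) = 1
levels = [(2 * number[n][n] + 1) * E0 for n in range(SIZE)]
print([number[n][n] for n in range(SIZE)])         # diagonal of N
print([commutator[n][n] for n in range(SIZE)])     # 1 everywhere except the last entry
print(levels)
```

The diagonal of N counts the quanta 0, 1, 2, ..., and the energy levels are $(2n + 1)E_0$. The commutator is the identity except in the last diagonal entry, where it equals 1 - size; this is an artefact of the truncation and a reminder that the relation $[a, a^\dagger] = I$ cannot hold for finite matrices.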

When our system consists of a number of subsystems (e.g. different particles), the
collective Hilbert space H will be given by the tensor product $H_1 \otimes \cdots \otimes H_n$ of the individual
Hilbert spaces (this was implicit in our treatment of measurement, where the two
subsystems were the observed and the observer). Given vectors $v_i \in H_i$, we are to think
of the ‘diagonal’ vector $v_1 \otimes \cdots \otimes v_n =: |v_1, \ldots, v_n\rangle$ as describing the situation where
subsystem i is in state $v_i$. However, as we know, a typical vector u in the tensor product
H won’t be of this diagonal form. Only for such states $|v_1, \ldots, v_n\rangle$ do the subsystems
themselves possess well-defined states. Even if the system begins in diagonal form (e.g.
we start with two distant particles), it will lose this as soon as the subsystems interact.
In this way, interacting systems lose their independent existence. This entangling of
quantum subsystems doesn’t occur in classical mechanics.
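Whether a given vector in a tensor product is of the ‘diagonal’ form can be tested mechanically in the smallest case. This sketch assumes two hypothetical two-dimensional subsystems; the criterion used (the 2 x 2 coefficient matrix has rank at most 1, i.e. zero determinant) is a standard linear-algebra fact rather than something stated in the text, and the function names are illustrative.

```python
import math

def tensor(v1, v2):
    """Coefficient matrix c[i][j] = v1[i] * v2[j] of the product vector v1 (x) v2."""
    return [[x * y for y in v2] for x in v1]

def is_product_state(c, tol=1e-12):
    """True when sum c[i][j] e_i (x) f_j equals v1 (x) v2 for some v1, v2.

    For two two-dimensional factors this holds exactly when the determinant vanishes."""
    return abs(c[0][0] * c[1][1] - c[0][1] * c[1][0]) < tol

diagonal = tensor([0.6, 0.8j], [1.0, 0.0])         # subsystems have their own states
entangled = [[1 / math.sqrt(2), 0.0],
             [0.0, 1 / math.sqrt(2)]]              # (e_0 (x) f_0 + e_1 (x) f_1) / sqrt(2)

print(is_product_state(diagonal), is_product_state(entangled))
```

The first vector factorises, so each subsystem has a well-defined state; the second does not, which is the entanglement described above.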

Something special, and also nonclassical, happens when the subsystems are identical
(i.e. the subsystems obey identical laws, and differ only in incidental characteristics such
as position). The collective Hilbert space H now is smaller than the full tensor product:
it will be the symmetric product of n copies of the subsystem $H_1$. More precisely, H is
spanned by ‘symmetric’ vectors of the form $|v_1, \ldots, v_n\rangle := \frac{1}{n!}\sum_{\sigma \in S_n} v_{\sigma 1} \otimes \cdots \otimes v_{\sigma n}$.
The physical reason for this is given in Figure 4.5. The first two diagrams represent
classically distinct scatterings, but in quantum mechanics trajectories don’t exist and
we can’t tell whether it is particle 1 or rather particle 2 moving northwest after the
collision: Figure 4.5(c) applies. The labels ‘1’ and ‘2’ have no physical significance
here: the vectors $|v_1, v_2\rangle$ and $|v_2, v_1\rangle$ now correspond to the same state (namely, the
one where one of the particles, we cannot ask which, is in state $v_1$ and the other is
in state $v_2$) and should be identified. Perhaps we can say that here is the precise pen
with which This August Personage signed That Important Document, but we cannot say
(pointing) that this electron here was part of the pen at that Propitious Moment. An easy
combinatorial consequence of this is that the identical particles here (but not those in the
next paragraph!) tend to clump into similar states. This is responsible, for instance, for
the existence of the laser.

Recall however that proportional vectors in the space S correspond to physically
equivalent states. Thus it merely suffices to identify, for example, $|v_1,v_2\rangle$ and $|v_2,v_1\rangle$ up
to a scalar factor. The preceding paragraph describes the bosons like photons of light
(named after S. N. Bose, who with Einstein first considered their statistical mechanics).
The next simplest possibility, describing the fermions such as electrons, obeys
$|v_1,v_2\rangle = -|v_2,v_1\rangle$. Their Hilbert space is spanned by antisymmetric vectors of the
form $|v_1,\ldots,v_n\rangle := \frac{1}{n!}\sum_{\sigma\in S_n}(-1)^\sigma\, v_{\sigma 1}\otimes\cdots\otimes v_{\sigma n}$, where ‘$(-1)^\sigma$’ equals $\pm 1$ for an
even/odd permutation $\sigma$, respectively. Note that antisymmetry forbids two fermions
from sharing the same state. This simple fact is directly responsible for the remarkable
diversity of chemical compounds, for if electrons obeyed instead the bosonic possibility
$|v_1,v_2\rangle = +|v_2,v_1\rangle$, then there wouldn’t be a chemical difference between the elements
hydrogen, helium, lithium, . . . It is also responsible for large-scale structure, for example,
why we don’t fall through the floor.
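The symmetric and antisymmetric combinations just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `permutation_sign`, `tensor` and `combine` are hypothetical, the one-particle space is taken to be two-dimensional, and the overall factor $1/n!$ is dropped since proportional vectors describe the same state.

```python
from itertools import permutations

def permutation_sign(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]  # one transposition
            sign = -sign
    return sign

def tensor(vectors):
    """Tensor product of vectors, stored as {index tuple: coefficient}."""
    result = {(): 1}
    for v in vectors:
        result = {idx + (i,): coeff * x
                  for idx, coeff in result.items()
                  for i, x in enumerate(v)}
    return result

def combine(vectors, fermionic):
    """Sum the tensor products over all orderings, with signs if fermionic."""
    total = {}
    for perm in permutations(range(len(vectors))):
        weight = permutation_sign(perm) if fermionic else 1
        for idx, coeff in tensor([vectors[k] for k in perm]).items():
            total[idx] = total.get(idx, 0) + weight * coeff
    return total

# Two orthogonal one-particle states (illustrative choice).
up, down = [1, 0], [0, 1]
fermi_pair = combine([up, down], fermionic=True)   # up(x)down - down(x)up
fermi_same = combine([up, up], fermionic=True)     # vanishes identically
bose_same = combine([up, up], fermionic=False)     # a nonzero multiple of up(x)up
```

In this toy setting `fermi_same` has all coefficients equal to zero, which is the statement that two fermions cannot share a state, whereas `bose_same` is nonzero.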

These bosonic and fermionic ‘statistics’ correspond to the two one-dimensional
representations of the symmetric group $S_n$, but there are other possibilities (e.g. parastatistics,
which involves higher-dimensional representations of $S_n$, and braid statistics, which
can occur when space-time is two-dimensional; both are discussed in, for example,
chapter IV of [269]). However, only bosons and fermions seem to arise in Nature (except
perhaps for some compound systems). Assuming this, a deep result of quantum field theory
(Fierz and Pauli’s Spin-Statistics Theorem; for a proof see section 4-4 of [518])
relates statistics to the Poincaré group. In particular, particles in relativistic quantum
mechanics carry a representation of the universal cover of the Poincaré group. When
that representation reduces to a representation of the Poincaré group itself, that is when
spatial rotations through $2\pi$ correspond to the identity (we say the ‘spin’ is an integer),
then the particle is a boson. Otherwise, that is when rotations through $2\pi$ correspond to
$-I$ (so the spin is a half-integer), the particle will be a fermion. A connection between
spin and statistics can be anticipated by the observation that the simple exchange of
locations of two objects involves an implicit rotation by $2\pi$ of one relative to the other.
We discuss this further in Section 4.3.5 below.

An important formulation of quantum physics is due to Feynman, and starts from an
observation of Dirac: the infinitesimal quantum mechanical amplitude is governed by the
value of the classical action (4.1.3). Suppose we know the wave-function $x\mapsto\psi(x,t_i)$
at some fixed initial time $t_i$. Then $\psi$ at some other time $t_f$ is given by

$$\psi(x'',t_f) = \int K(x'',x';t_f,t_i)\,\psi(x',t_i)\,dx', \tag{4.2.6a}$$

where $K$, called the ‘propagation kernel’, is the amplitude for a particle to go from
position $x'$ at time $t_i$ to position $x''$ at time $t_f$. The point is that $K$ is given by the
‘path integral’ $\int\exp(iS(x)/\hbar)\,\mathcal{D}x$ over all paths $x: t\mapsto x(t)$ with endpoints $x(t_i) = x'$,
$x(t_f) = x''$. For each choice of path $x(t)$, $S(x)$ here is the classical action $\int L(x,\dot{x})\,dt$.
Integrals over spaces of paths arise here for much the same reason that the entries of
powers $A^n$ of a matrix could be described as sums over length-$n$ walks through the
entries of $A$. The path integral formulation intuits that the particle takes every conceivable
trajectory from $(x',t_i)$ to $(x'',t_f)$, and each of these (appropriately weighted) contributes
to the amplitude $K$ and hence probability $|K|^2$. The precise mathematical meaning of
Feynman’s path integral is a little elusive, but attempts to define it in terms of, for
example, Wiener integrals have been made. It is probably simplest though to regard it
heuristically, as is done in Section 4.4.1.

Fig. 4.6 Feynman diagrams in quantum mechanics.
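The matrix analogy can be checked directly. The following minimal sketch (function names are hypothetical, chosen for illustration) computes an entry of $A^n$ in two ways: by repeated matrix multiplication, and by summing the product of entries along every length-$n$ walk between the two endpoints, the finite analogue of summing over all paths.

```python
from itertools import product

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_power(a, n):
    """A^n by repeated multiplication, starting from the identity."""
    size = len(a)
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, a)
    return result

def walk_sum(a, n, start, end):
    """Sum over all length-n walks start -> ... -> end (n >= 1) of the
    product of the matrix entries along the walk."""
    total = 0
    for middle in product(range(len(a)), repeat=n - 1):
        path = (start, *middle, end)
        weight = 1
        for i, j in zip(path, path[1:]):
            weight *= a[i][j]
        total += weight
    return total

A = [[1, 2], [3, 4]]  # illustrative matrix
entry_by_power = matrix_power(A, 3)[0][1]
entry_by_walks = walk_sum(A, 3, 0, 1)
```

Both computations give the same number, because expanding the matrix product entry by entry is exactly a sum over intermediate indices, just as the kernel $K$ is built by integrating over intermediate positions.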

Consider the classical limit $\hbar\to 0$ of (4.2.6a): using the stationary phase approximation,
the dominant path x’(t) in the Feynman integral is one that satisfies the Euler–Lagrange
equation (4.1.4). This provides an explanation for the mysteriously teleological
Hamilton’s principle of classical mechanics, discussed in Section 4.1.1.

The perturbative approach to quantum theories is particularly transparent in the path
integral formalism. Write the Lagrangian as the sum $L = L_0 + \lambda L_{\mathrm{int}}$ of the free part $L_0$
and the interaction part $\lambda L_{\mathrm{int}} = -\lambda V$; the ‘coupling constant’ $\lambda$ is a numerical constant
(hopefully small), and we aim to expand the kernel $K$ (and hence the wave-function $\psi$)
in a Taylor expansion in $\lambda$. Explicitly, we have


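$$\begin{aligned}
K(x'',x';t_f,t_i) &= \int \exp\left[\frac{i}{\hbar}\int_{t_i}^{t_f}(L_0 - \lambda V)\,dt\right]\mathcal{D}x \\
&= \int \exp\left[\frac{i}{\hbar}\int_{t_i}^{t_f}L_0\,dt\right]\,\sum_{n=0}^{\infty}\frac{1}{n!}\left(\frac{-i\lambda}{\hbar}\int_{t_i}^{t_f}V\,dt\right)^{n}\mathcal{D}x.
\end{aligned} \tag{4.2.6b}$$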


We can represent this pictorially. The $n = 0$ term describes a particle propagating freely
from $(x',t_i)$ to $(x'',t_f)$; the Feynman diagram for this term is given in Figure 4.6(a). The
$n = 1$ term describes a particle propagating freely from $(x',t_i)$ to some intermediate point
$(x,t_1)$, at which instant the potential $V$ acts multiplicatively, and then the particle resumes
free propagation to the final position $(x'',t_f)$; we then integrate over all intermediate times
(and finally over all paths $x(t)$). The Feynman diagram is given in Figure 4.6(b), where the
integration over $t_1$ is implicit. The kink there is called a ‘vertex’ (this is the same word as
in vertex operator algebra). Likewise, the $\lambda^n$ term corresponds to a Feynman diagram with
$n$ vertices, corresponding to the $n$ integrals $\int V\,dt_i$ in (4.2.6b). The factor $n!$ in (4.2.6b)
is removed by taking these intermediate times in the order $t_i < t_1 < \cdots < t_n < t_f$, as
the diagrams suggest. In this way, we have replaced the actual physical situation, where
of course the interaction $V$ is always present, with a situation where the interaction is
only present at discrete moments of time. It is as if the particle only interacts with $V$ at
the vertices. These are called virtual interactions, as they are mathematical artifacts and
don’t correspond directly to actual events in Nature.
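The way time-ordering absorbs the factor $n!$ can be made explicit. The identity below is a sketch under the assumption that, along a fixed path $x(t)$, the quantity $V(t) := V(x(t))$ is an ordinary (commuting) numerical function of time:

```latex
\frac{1}{n!}\left(\int_{t_i}^{t_f} V(t)\,dt\right)^{n}
  \;=\; \int_{t_i < t_1 < \cdots < t_n < t_f}
        V(t_1)\,V(t_2)\cdots V(t_n)\;dt_1\,dt_2\cdots dt_n .
```

The integrand $V(t_1)\cdots V(t_n)$ is symmetric in its arguments, and the cube $[t_i,t_f]^n$ splits (up to a set of measure zero) into $n!$ regions, one for each ordering of the times, all contributing equally. For $n = 2$ this is the familiar statement that integrating over the triangle $t_1 < t_2$ gives half of the integral over the square.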

We’ll say more about perturbations and Feynman diagrams later. Typically, the sum 
(4.2.6b) won’t converge, but the first few terms (when interpreted correctly) give good 
comparison with experiment. Conformal field theory — the physics of Moonshine — arises 
from the perturbative expansion of the quantum field theory called string theory. 

Its treatment of measurement demonstrates that quantum mechanics is heuristic and 
idealised, and not at all in its finished form. But just as classical physics achieved a pro- 
found understanding of the concept of ‘rest’, and relativity provided a deep reanalysis of 
space and time, so is quantum mechanics forcing us to reconsider the seemingly harmless 
notion of observation. After all, we never observe an object, but rather the interaction 
between objects. Also profound, quantum mechanics teaches us that interacting subsys- 
tems become entangled, and physically this means that the whole is indeed much more 
than the disjoint union of its parts. 


4.2.2 Informal quantum field theory 


It is surprising that the next three natural tasks — namely, to bring in special relativity, to 
handle the experimental fact that the number of elementary particles can change, and to 
quantise classical field theories — are all accommodated by quantum field theories, the 
quantum theories of systems with infinitely many degrees of freedom. The sketch we 
provide here won’t seem very satisfactory, but this is roughly the treatment to be found in 
physics textbooks. We avoid as too tangential most calculational issues and many tech- 
nicalities (e.g. the quirks of fermions). Section 4.2.4 provides a more careful axiomatic 
treatment of quantum field theory, but knowing the informal physics background, at 
least in its broader strokes, is essential. A dated though otherwise excellent treatment of 
quantum field theory, somewhat in our style, is [479]; modern and masterful is [555]. 
To the working physicist, quantum field theory is the following conceptual hierarchy. 


(i) Experiment. The experimenter measures half-lives of particles and scattering 
cross-sections. How well does experiment compare to theory? 

(ii) Amplitudes. These observable quantities depend on the magnitude-squared of the
appropriate transition amplitude $|\mathrm{in}\rangle \to |\mathrm{out}\rangle$. Unfortunately, transition
amplitudes are too hard to calculate from the theory, except in infinite time
($t\to\pm\infty$) limits, which by definition are the entries of the S-matrix. Those
limits, though mathematically dubious, are physically intuitive. So the theoretician
needs to compute the S-matrix.

(iii) Correlation functions. The typical way to compute S-matrix entries is using 
correlation functions, via the so-called reduction formulae. So the theoretician 
wants to compute correlation functions. 

(iv) Feynman diagrams. Typically, correlation functions are calculated ‘perturbatively’
by Taylor-expanding in some coupling constant. Each term in this (usually 
divergent) infinite series is computed separately using Feynman diagrams. 


Moonshine is interested in the correlation functions of a class of extremely symmetrical 
and well-behaved quantum field theories called rational conformal field theories — these 
theories are so special that their correlation functions can be computed exactly and 
perturbation is not required. But before we turn to them, let’s flesh out some of this 
hierarchy. 

It would seem trivial to make quantum mechanics consistent with special relativity.
Consider, for simplicity, a free particle of mass $m$. Recall that Schrödinger’s equation
(4.2.1) corresponds to the nonrelativistic energy $E = \frac{1}{2m}p^2$. Since relativistic energy
satisfies $E^2 - p^2c^2 = m^2c^4$, the natural guess for the relativistic Schrödinger equation
would be

$$\left(\hbar^2\frac{\partial^2}{\partial t^2} - \hbar^2c^2\nabla^2 + m^2c^4\right)\phi(x,t) = 0. \tag{4.2.7}$$

This is called the Klein–Gordon equation, and was proposed independently by
Schrödinger, Klein and Gordon shortly after (4.2.1) was written down.⁷ They expected
it to describe the relativistic wave-function $\phi$ of a free ‘scalar’ particle (i.e. $\phi(x)$ is
invariant under the action of the Lorentz group $SO^+_{3,1}(\mathbb{R})$ on $x$), but such a theory is sick
(see Question 4.2.4): for example, it suffers from negative probabilities and the energy
eigenvalues have no lower bound (this means that we won’t have a vacuum state $|0\rangle$,
which is bad). The way to make (4.2.7) into a sensible physical theory is to interpret it
as a quantum field theory.
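For orientation, the ‘natural guess’ can be spelled out using the standard quantum-mechanical substitution of operators for energy and momentum (a textbook step, sketched here rather than quoted from this chapter):

```latex
E \;\mapsto\; i\hbar\,\frac{\partial}{\partial t}, \qquad
\mathbf{p} \;\mapsto\; -i\hbar\,\nabla
\quad\Longrightarrow\quad
E^2 - \mathbf{p}^2c^2 - m^2c^4 = 0
\;\;\mapsto\;\;
\left(-\hbar^2\frac{\partial^2}{\partial t^2} + \hbar^2c^2\nabla^2 - m^2c^4\right)\phi = 0 ,
```

which is (4.2.7) up to an overall sign. A plane wave $\phi = \exp[i(\mathbf{p}\cdot\mathbf{x} - Et)/\hbar]$ solves it exactly when $E = \pm\sqrt{\mathbf{p}^2c^2 + m^2c^4}$; the negative root, with $|\mathbf{p}|$ arbitrarily large, is where the absence of a lower bound on the energy comes from.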

Quantum field theory is far deeper than quantum mechanics, both physically and 
mathematically. Witten predicts [566] that one of the major themes of twenty-first century 
mathematics will involve coming to grips with quantum field theory. 

Let $\Omega\subset\mathcal{H}\subset\Omega^*$ be a rigged Hilbert space; $\Omega$ is the span of the states in the theory,
and is constructed below, while $\mathcal{H}$ is their topological span. We obtained nonrelativistic
quantum mechanics by replacing classical observables by operators, so we would expect
that the fields $\varphi(x)$ in quantum field theories are operator-valued functions of space-time.
Unfortunately this is too optimistic, even in the simplest free theories. Rather, the correct
statement is that quantum fields $\varphi$ are operator-valued distributions of space-time: for any
states $u,v\in\Omega$, the matrix entries $\langle u,\varphi v\rangle$ of $\varphi$ are tempered distributions of space-time.
In other words, the Schwartz space $S = S(\mathbb{R}^4)$ is a space of test functions of space-time
that ‘smear’ the fields; the values $\varphi(f)$, for each $f\in S$, are (unbounded) linear operators
$\Omega\to\Omega$. Nevertheless, it is traditional to write $\varphi(x)$, as if the fields were functions of
space-time, and informally think of $\varphi(f)$ as the integral $\int f(x)\varphi(x)\,d^4x$. Unlike the
wave-functions of quantum mechanics, a quantum field is not directly a probability


⁷ Apparently, Schrödinger first derived the relativistic equation, noticed that it didn’t work but that its
nonrelativistic approximation (4.2.1) looked good, and so first published the approximation! See the
historical discussion on page 4, vol. I of [555].



amplitude; rather, it is a linear combination of operators that increase or decrease by one 
the numbers of particles in any state. 

Let $\varphi_1,\ldots,\varphi_n$ be the complete list of quantum fields in the theory. All operators (e.g.
observables) occurring in the theory are constructed from these fields. More precisely,
locality says that any operator at a given space-time point $x$ is a function of fields and
their derivatives, all evaluated at that point.

The mathematical meaning of a theory being (special-)relativistic is that its quantities
transform nicely with respect to (i.e. in projective representations of) the Lorentz and
Poincaré groups $SO^+_{3,1}$ and $\mathbb{R}^4\rtimes SO^+_{3,1}$. As in Theorem 3.1.1, those projective
representations are true representations of the universal covers $SL_2(\mathbb{C})$ and $\mathbb{R}^4\rtimes SL_2(\mathbb{C})$,
respectively. Firstly, the state space $\mathcal{H}$ carries a unitary representation $(a,\Lambda)\mapsto U_{(a,\Lambda)}$ of the
universal cover of the Poincaré group. These operators $U_{(a,\Lambda)}$ send the state space $\Omega$ onto
itself; on $\Omega$, we can write $U_{(a,I)} =: \exp[-i\sum_\mu a_\mu P^\mu/\hbar]$, where the self-adjoint operators
$P^\mu$ are the observables for momentum and (up to a constant) energy. In particular, $c^2P^4$ is
the Hamiltonian. The absence of tachyons (footnote 4 in this chapter) says that the
simultaneous eigenvalues $(\mathbf{p},p^4)$ of the energy-momentum operators $P^1,P^2,P^3,P^4$ all
have nonpositive Minkowski norm-squared $\sum_\mu p_\mu p^\mu = \mathbf{p}^2 - c^2(p^4)^2 =: -m^2c^2$. This
parameter $m$ is constant in any irreducible representation of $\mathbb{R}^4\rtimes SL_2(\mathbb{C})$, and is called
the (rest-)mass.

The span of the fields $\varphi_i$ carries a projective representation of all symmetries of the
theory. In particular, there is an $n$-dimensional representation $V$ of $SL_2(\mathbb{C})$, governing
how the $n$ fields transform relativistically: that is,

$$U_{(a,\Lambda)}\,\varphi_i(f)\,U_{(a,\Lambda)}^{-1} = \sum_{j=1}^{n} V(\Lambda^{-1})_{ij}\,\varphi_j((a,\Lambda).f) \tag{4.2.8a}$$

holds in $\Omega$, where the Poincaré transformation $(a,\Lambda)\in\mathbb{R}^4\rtimes SL_2(\mathbb{C})$ acts on test functions
by $((a,\Lambda).f)(x) = f(\Lambda x + a)$. The inverses on the right side are needed in order
for (4.2.8a) to be consistent with $U_{(a,\Lambda)}\circ U_{(a',\Lambda')} = U_{(a,\Lambda)\circ(a',\Lambda')}$. Restricting to
translations $\mathbb{R}^4$, the derived representation of (4.2.8a) becomes the important equation of
motion

$$\partial_\mu\varphi(x) = \frac{i}{\hbar}\,[P_\mu,\varphi(x)]. \tag{4.2.8b}$$


Since the finite-dimensional representations of $SL_2(\mathbb{C})$ are completely reducible, we can
collect the fields together that form irreducible representations, parametrised by Dynkin
label $\lambda_1\in\mathbb{N}$. Mysteriously, physicists prefer to use spin $s = \lambda_1/2$.

In classical field theory, the particles and fields are phenomenologically independent 
even though they mutually influence each other. In quantum field theory, particles are 
secondary, arising from fields, as we see shortly. A great definition, due to Wigner, is: 


Definition 4.2.1 A particle is an irreducible projective representation of the Poincaré
group, with real mass $m$ and energy $c^2p^4 > 0$, in the space $\mathcal{H}$ of states of the theory.


More precisely, the spectra $(\mathbf{p},p^4)$ of the energy-momentum operators $P^\mu$ in an irreducible
representation are required to obey $\mathbf{p}^2 \le c^2(p^4)^2$; the mass $m\ge 0$ is the constant
fixed by $m^2c^2 = c^2(p^4)^2 - \mathbf{p}^2$. Only the vacuum has 0 energy. Unlike the mass, the energy varies within
the irreducible representation, and for a particle of mass $m$ is never less than $mc^2$.
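A small numerical sketch may help separate what is constant in an irreducible representation (the mass) from what varies (the energy). It assumes the conventions used above, namely $m^2c^2 = c^2(p^4)^2 - \mathbf{p}^2$ and energy $c^2p^4$; the function names, the value of `C` and the sample momentum are illustrative choices, not taken from the text.

```python
import math

C = 3.0  # speed of light in arbitrary units (illustrative value)

def on_shell_p4(p, m, c=C):
    """p^4 for spatial momentum p and mass m, from m^2 c^2 = c^2 (p^4)^2 - |p|^2."""
    p_squared = sum(x * x for x in p)
    return math.sqrt(p_squared + m * m * c * c) / c

def rest_mass(p, p4, c=C):
    """Invert the same relation to recover the mass."""
    p_squared = sum(x * x for x in p)
    return math.sqrt(c * c * p4 * p4 - p_squared) / c

def energy(p4, c=C):
    """Energy c^2 p^4 in these conventions."""
    return c * c * p4

def boost_x(p, p4, v, c=C):
    """Lorentz boost with velocity v along the x-axis, written for (p, p^4 = E/c^2)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v / (c * c))
    px, py, pz = p
    return (gamma * (px - v * p4), py, pz), gamma * (p4 - v * px / (c * c))

p, m = (1.0, 2.0, 0.5), 2.0
p4 = on_shell_p4(p, m)
boosted_p, boosted_p4 = boost_x(p, p4, v=1.2)
```

Here `rest_mass(boosted_p, boosted_p4)` agrees with `m` up to rounding, while `energy(boosted_p4)` differs from `energy(p4)`; both energies are at least `m * C**2`.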

Subatomic experiments suggest that there are elementary (i.e. noncomposite) particles,
for instance electrons. Each species of elementary particle in the theory arises from an
irreducible $SL_2(\mathbb{C})$-module in the span of the fields $\varphi_i$. In particular, a particle with spin
$s\in\frac{1}{2}\mathbb{N}$ requires $2s+1$ fields $\varphi_{i_1},\ldots,\varphi_{i_{2s+1}}$, called its components. Other symmetries of
the theory combine with $SL_2(\mathbb{C})$ to form higher-dimensional representations. For example,
in quantum electrodynamics,⁸ ‘parity’ (i.e. the space-reflection $x\mapsto -x$) collects the
two-component ‘left-’ and ‘right-handed’ electrons into an irreducible four-dimensional
representation, while in the Standard Model parity is no longer a symmetry, but the
left-handed electron and neutrino transform together as components in a four-dimensional
representation of the symmetry group $SU_3\times SU_2\times U_1$, while the right-handed electron
forms a two-dimensional representation by itself.

A Lagrangian density $\mathcal{L}(x)$ here is a self-adjoint operator, invariant under $SL_2(\mathbb{C})$,
built up polynomially from the various $\varphi_i$ and $\partial_\mu\varphi_i$, all evaluated at the same space-time
point $x$. Each field $\varphi_i$ obeys the corresponding Euler–Lagrange equation (4.1.8). As in
classical field theory, define the ‘canonical momentum field’ $\pi_i(x) = \partial\mathcal{L}/\partial(\partial_4\varphi_i)$ (not to
be confused with the momentum operators $P^\mu$). The equal-time commutation relations

$$[\varphi_i(x,t),\pi_j(x',t)] = i\hbar\,\delta_{ij}\,\delta(x-x'), \tag{4.2.9a}$$
$$[\varphi_i(x,t),\varphi_j(x',t)] = [\pi_i(x,t),\pi_j(x',t)] = 0 \tag{4.2.9b}$$

are obtained from the classical Poisson brackets (4.1.9) via standard (‘canonical’)
quantisation. When both $\varphi_i$, $\varphi_j$ are fermionic (i.e. have fractional spin), then (4.2.9) should
be replaced with anti-commutation relations. For simplicity, we consider only bosonic
fields.

Because disturbances shouldn’t travel faster than light, measurements occurring at
space-time points $x$, $x'$ that are space-like separated (i.e. $(x-x')^2 > 0$) should be
independent. Quantum theory translates this into the statement that the corresponding
observables $O(x)$, $O'(x')$ should commute: $[O(x),O'(x')] = 0$ when $(x-x')^2 > 0$. Since the
observables are built out of the fields $\varphi_i$, this is closely related to the commutation
relations (4.2.9). Nevertheless, the relations (4.2.9) are controversial, as we’ll see.

⁸ Quantum electrodynamics (‘QED’ for short) is the quantum theory of Maxwell’s electromagnetism applied
to electrons, positrons (the anti-particle of the electron) and photons (the particle of light). QED is
subsumed by the Standard Model, the quantum field theory describing all known physics except for gravity.

To see how to use the field equations and (4.2.9), consider for example the density

$$\mathcal{L}(x) = -\frac{1}{2}\left(m^2c^2\hbar^{-2}\,\phi(x)^2 + \sum_\mu \partial_\mu\phi(x)\,\partial^\mu\phi(x)\right), \tag{4.2.10a}$$

where $\phi = \phi^\dagger$ is self-adjoint. (We will see shortly that this $\mathcal{L}$ has to be modified slightly
to be physically sensible.) The field equation here is the Klein–Gordon equation (4.2.7).
It can be solved by a trick: the Fourier transform of $\phi$ from ‘position-space’ into
‘momentum-space’ converts the Klein–Gordon equation into decoupled classical simple
harmonic oscillator equations, so the field $\phi$ can be formally written


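$$\phi(x,t) = \int \sqrt{\frac{\hbar}{2(2\pi\hbar)^3\,\omega_p}}\,\left\{ a(p)\exp\left[\frac{i}{\hbar}\,p\cdot x - i\omega_p t\right] + a(p)^\dagger\exp\left[-\frac{i}{\hbar}\,p\cdot x + i\omega_p t\right]\right\} d^3p, \tag{4.2.10b}$$

where $\omega_p = \hbar^{-1}p^4 = c\hbar^{-1}\sqrt{p^2 + m^2c^2}$. If $\phi$ were a real-valued function, (4.2.10b)
would give the general solution, for arbitrary coefficients obeying $\overline{a(p)} = a(p)^\dagger\in\mathbb{C}$.
Here the coefficients are operators, with $a(p)^\dagger$ the adjoint of $a(p)$ (hence the notation).
The canonical momentum is $\pi = \partial_4\phi$. Solving (4.2.10b) for $a(p)$ and $a(p)^\dagger$ in terms of
$\phi$, equations (4.2.9) become

$$[a(p),a(p')^\dagger] = \delta(p-p'), \qquad [a(p),a(p')] = [a(p)^\dagger,a(p')^\dagger] = 0. \tag{4.2.10c}$$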


This trick of switching from position variables to momentum variables is common in
field theory, and it isn’t surprising that it should simplify the mathematics: the momentum
degrees of freedom are uncoupled because the theory is translation-invariant (Noether’s
Theorem!). If instead $\phi$ is not self-adjoint, then we should expand $\phi$ into independent
coefficients $a(p)$, $b(p)^\dagger$.

How do we accommodate particles in quantum field theory? First note that the particle
interpretation pertains directly to state vectors $v\in\Omega$, and not the fields: for example,
our universe corresponds to some vector $|\mathrm{universe}\rangle\in\Omega$. There are, for example, only
four electron fields (i.e. one component for each internal degree of freedom); all of the
nearly infinitely many electrons in the universe are created by those fields in a way
we’ll describe shortly. The number of electrons is an observable quantity, and hence an
eigenvalue of the ‘electron-number’ operator $N_e$. Thus a typical vector $v\in\Omega$ will not
have a well-defined number of (say) electrons.

The most important vector in $\Omega$ is the vacuum state $|0\rangle\in\Omega$, which contains zero
particles of each type. It is fixed by the representation of the universal cover of the
Poincaré group, i.e. $U_{(a,\Lambda)}|0\rangle = |0\rangle$, so in particular the state $|0\rangle$ has total momentum
0 and energy 0. As before, it is unique up to scalar multiplication, nondegenerate and
has norm 1: $\langle 0|0\rangle := \||0\rangle\|^2 = 1$. (Actually, in quantum field theories with spontaneous
symmetry breaking, such as the Standard Model, the vacuum will be degenerate, but we
will ignore this possibility here.)

The particle interpretation is simplest in the free scalar field theory (4.2.10). Equations
(4.2.10b) and (4.2.10c) tell us to think of the free field $\phi$ as infinitely many
independent quantum harmonic oscillators (4.2.5), one for each possible momentum.
The analogue of the one-particle state $|1\rangle$ there should be the one-particle state $|p\rangle$
with momentum $p$ and energy $\hbar\omega_p$, defined by $|p\rangle := a(p)^\dagger|0\rangle$. The problem is that its
normalisation

$$\||p\rangle\|^2 = \langle 0|\,a(p)\,a(p)^\dagger\,|0\rangle = \delta(0),$$

obtained using (4.2.10c), is infinite. This is why a quantum field $\phi$ is an operator-valued
distribution. The one-particle states can’t have well-defined momenta, but rather are
‘wave-packets’, linear combinations (‘superpositions’) of those momentum states $|p\rangle$
constructed using test functions $f$. In particular, let $f$ be in the Schwartz space $S(\mathbb{R}^{3k})$.
The $k$-particle states in $\Omega$ are of the form

$$|f\rangle := \int f(p_1,\ldots,p_k)\,a(p_1)^\dagger\cdots a(p_k)^\dagger\,|0\rangle\;d^3p_1\cdots d^3p_k.$$


The state $|f\rangle$ is an eigenvector of the number operator $N_\phi := \int a(p)^\dagger a(p)\,d^3p$, with
eigenvalue $k$. The operators $a(p)$ again are annihilation operators and take a $k$-particle state to
a $(k-1)$-particle state. Together, all these $k$-particle states, for $k = 0,1,\ldots$, span the
space $\Omega$. The commutation relation $[a(p)^\dagger,a(p')^\dagger] = 0$ means that the particles obey bosonic
statistics, that is both $f\in S(\mathbb{R}^{3k})$ and its symmetrisation $\frac{1}{k!}\sum_{\sigma\in S_k} f(p_{\sigma 1},\ldots,p_{\sigma k})$
define physically identical states.
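A single momentum mode already shows how creation, annihilation and number operators fit together. The sketch below is a finite-dimensional toy, not the field-theoretic construction: it truncates one oscillator to the span of $|0\rangle,\ldots,|N-1\rangle$ (so the commutation relation necessarily fails in the last diagonal entry), and all function names are hypothetical.

```python
import math

def annihilation(size):
    """Matrix of a on span{|0>, ..., |size-1>}, where a|n> = sqrt(n) |n-1>."""
    return [[math.sqrt(col) if row == col - 1 else 0.0 for col in range(size)]
            for row in range(size)]

def transpose(m):
    """Transpose; for these real matrices this is the adjoint."""
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def subtract(x, y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

SIZE = 5                                   # truncation level (illustrative)
a = annihilation(SIZE)
a_dag = transpose(a)                       # creation operator
number = matmul(a_dag, a)                  # a† a, diagonal with entries 0, 1, 2, ...
commutator = subtract(matmul(a, a_dag), number)   # [a, a†]
```

The diagonal of `number` counts quanta, and `commutator` is the identity except in the last diagonal entry, which is an artifact of cutting the space off at `SIZE` states.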

Just as a pendulum in classical mechanics undergoes small oscillations about its (ver- 
tical) stationary equilibrium position, so does the vacuum in quantum field theory. The 
oscillations of the quantum vacuum are the electrons, photons, etc. observed in Nature. 
This particle concept is the kinematics of quantum field theory. 

In these free theories, the $k$ particles in $|f\rangle$ move independently and freely. The notion
of wave-packets explains the tracks of particles in the cloud chambers of high-energy 
experiments: such tracks seem to indicate that the particle has, to a good approximation, 
both a well-defined position and momentum. By contrast, the (nonphysical) momentum 
eigenstates |p) are diffused throughout the universe. 

Similarly, particles in any free quantum field theory arise by interpreting the Fourier 
coefficients of the fields as creation and annihilation operators (theories with interactions 
are considered shortly). Now, any operator can be expressed as an integral of sums and 
products of these creation and annihilation operators (see section 4.2 of [555] for a proof). 
For example, the free scalar theory (4.2.10) has energy-momentum operators

$$P^\mu = \frac{1}{2}\int p^\mu\left(a(p)^\dagger a(p) + a(p)\,a(p)^\dagger\right) d^3p.$$

Since $[N_\phi,P^4] = 0$, we see from (4.2.4) that in this free theory the number of particles
won’t change. It can change only when we include interactions.

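Note that in the free scalar theory $P^\mu|0\rangle = 0$ for $\mu = 1,2,3$, as it should, but $P^4|0\rangle$,
which gives the energy of the vacuum, is

$$P^4|0\rangle = \int \hbar\omega_p\left(a(p)^\dagger a(p) + \frac{1}{2}\right)|0\rangle\,d^3p = 0 + \frac{\hbar}{2}\int\omega_p\,d^3p\;|0\rangle,$$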


so is divergent. This is a typical infinity in quantum field theory, but is easy to remedy, as
it tells us that the Hamiltonian density $H(p)$ (hence our original Lagrangian density $\mathcal{L}(x)$)
is off by an additive (infinite) constant. It isn’t surprising in hindsight that the naive guess
(4.2.10a) for $\mathcal{L}(x)$ runs into problems: for one thing, classical energy is only defined up
to an additive constant; for another, the order in which the numerical coefficients $a$, $a^\dagger$
appear in classical expressions for energy doesn’t matter, while the order of the operators
$a$, $a^\dagger$ in quantum mechanics certainly does. Replacing $\mathcal{L}(x)$ and $H(p)$ with their ‘normal
orders’ $:\!\mathcal{L}\!:$ and $:\!H\!:$, respectively, gives the vacuum zero energy and doesn’t otherwise
change the physics. The normal order $:\!O\!:$ of an operator $O$ given by an integral over $p$’s
of a product of $a(p)$’s and $a(p)^\dagger$’s is obtained by moving all annihilation operators $a(p)$
to the right of all creation operators $a(p)^\dagger$. This has the effect of making the evaluation of
operators on states as simple as possible. For example, the Hamiltonian density becomes

$$:\!P^\mu\!: \;= \int p^\mu\,a(p)^\dagger a(p)\,d^3p.$$


The same procedure works in any quantum field theory to give the vacuum zero energy, 
with a minor change when there are fermions. We also used normal-ordering in, for 
example, (3.2.14a) to remove an analogous infinity in Lie theory. 
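The difference between honestly reordering operators and the normal-ordering prescription can be seen in a one-mode toy model. In the sketch below (an illustration only; the encoding and function names are assumptions) a product of operators is a string in which `a` stands for the annihilation operator and `A` for the creation operator $a^\dagger$, with $[a,a^\dagger] = 1$ in place of the delta function.

```python
from fractions import Fraction

def reorder(word):
    """Rewrite a one-mode operator product as a sum of normal-ordered words,
    using a A = A a + 1. Returns {normal-ordered word: coefficient}."""
    result = {}
    stack = [(word, 1)]
    while stack:
        w, coeff = stack.pop()
        i = w.find("aA")
        if i < 0:                      # already normal-ordered
            result[w] = result.get(w, 0) + coeff
        else:
            stack.append((w[:i] + "Aa" + w[i + 2:], coeff))  # swapped term
            stack.append((w[:i] + w[i + 2:], coeff))         # commutator term
    return result

def normal_order(word):
    """The : : prescription: move every A to the left, dropping commutators."""
    return "A" * word.count("A") + "a" * word.count("a")

def vacuum_expectation(terms):
    """<0| ... |0> of a sum of normal-ordered words: only the constant survives."""
    return terms.get("", 0)

# Symmetric one-mode 'energy' (1/2)(A a + a A), in units where hbar*omega = 1.
symmetric = {}
for w in ("Aa", "aA"):
    for word, coeff in reorder(w).items():
        symmetric[word] = symmetric.get(word, 0) + Fraction(1, 2) * coeff
```

Here `symmetric` equals `A a` plus the constant 1/2, the zero-point energy of the mode, whereas `normal_order("aA")` is just `"Aa"` with no constant. Summing that 1/2 over infinitely many modes is what produces the divergent vacuum energy that normal ordering discards.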

The existence of negative energy states, which we recall was a serious sickness for
relativistic quantum mechanics, is handled naturally in quantum field theory. Return for
simplicity to the scalar theory, but now with $\phi\neq\phi^\dagger$. The positive energy coefficients
$a(p)$ of $\phi$ annihilate a positive energy particle; the negative energy coefficients $b(p)^\dagger$
create a positive energy particle. The particle annihilated by the field $\phi$ is not quite
the same as the particle created by $\phi$: the various parameters describing particles will
either be the same (e.g. mass) or opposite (e.g. electric charge), for these two kinds of
particles. That is, the pair $\phi$, $\phi^\dagger$ of fields is associated with pairs of particles; one of
these we arbitrarily call the anti-particle. Physically, an anti-particle can be interpreted
as the corresponding particle ‘travelling backwards in time with negative energy’, and
that is how it is depicted in Feynman diagrams. When $\phi = \phi^\dagger$, the particle is its own
anti-particle.

This is how particles arise in free quantum field theories. The physically interesting
quantum field theories have interactions, that is additional terms in $\mathcal{L}(x)$ corresponding
to potential energy. Experiments (e.g. the cloud chambers) tell us that a particle
interpretation is still appropriate there. A typical experiment begins and ends with several
particles separated by macroscopic distances; interactions occur only at intermediate
times when some particles are microscopically separated. What we observe are the
initial (‘incoming’) and final (‘outgoing’) states, and the transition probabilities $|\langle\mathrm{out}|\mathrm{in}\rangle|^2$.
Now, macroscopically separated particles should behave independently to good accuracy.
Thus these initial and final states are described by the corresponding free theory,
at least in the limits $t\to\mp\infty$. A particle interpretation applies directly only to these
asymptotic states.

In particular, to each field $\varphi_i$ in a quantum field theory⁹ there are fields $\varphi_i^{\mathrm{in}}$ and $\varphi_i^{\mathrm{out}}$.
The field equations (4.1.8) for the $\varphi_i$ of course include interaction effects, whereas
the asymptotic fields $\varphi_i^{\mathrm{in}}$, $\varphi_i^{\mathrm{out}}$ obey the free field equations, such as the Klein–Gordon
equation (4.2.7). Because $P^4|0\rangle = 0$, the vacuum is constant in time (‘stable’) and is its



⁹ Many of the following comments assume the associated particle is stable and can exist in isolation of the
other particles, at least asymptotically. This is the case, for example, for an electron, but not the muon or 
quark, which are also elementary and have their own fields. See the literature for the necessary 
modifications. 
own incoming and outgoing asymptotic state. All other incoming states are built up from
the vacuum $|0\rangle$ and $\varphi^{\mathrm{in}}$ by the process described earlier. The collection of all incoming
states spans the space $\Omega$. Similarly, $|0\rangle$ and $\varphi^{\mathrm{out}}$ create all outgoing states, and these also
span $\Omega$. Thus the ‘in-fields’ $\varphi_i^{\mathrm{in}}$ describe the (hypothetical) physics that would occur if the
initial particles never interacted; the field $\varphi_i$ interpolates between these free initial and
final asymptotic situations (up to a multiplicative constant, as we’ll see), and embodies
the true physics by carrying the dynamical information of the system.

As mentioned earlier, experiments obtain information on the transition amplitudes
$\langle\mathrm{out}|\mathrm{in}\rangle$ between (prepared) initial states and the (observed) final states, and the
complicated machinery of quantum field theory is designed to compute these. These inner
products can be thought of as matrix entries of an operator $S$, the S(cattering)-matrix, which
defines the equivalence $\varphi_i^{\mathrm{out}} = S^{-1}\varphi_i^{\mathrm{in}}S$ between the algebras of in-fields and of
out-fields, and the equivalence $|\mathrm{in}\rangle = S\,|\mathrm{out}\rangle$ between the corresponding incoming and
outgoing states. Without going into the technical details, the so-called
‘Lehmann–Symanzik–Zimmermann reduction formulae’ (see e.g. section 7.2 of [479], or section 5-1-3 of [310])
express the transition amplitudes in terms of an $n$-fold integral $\int d^4x_1\cdots d^4x_n$ over
space-time, of ‘$n$-point (correlation) functions’, or ‘Green’s functions’, or ‘vacuum-to-vacuum
expectation values’ of ‘time-ordered products’ of the physical fields:

$$\langle\varphi_{j_1}(x_1)\cdots\varphi_{j_n}(x_n)\rangle = \langle 0|\,T(\varphi_{j_1}(x_1)\cdots\varphi_{j_n}(x_n))\,|0\rangle. \tag{4.2.11}$$

We will usually use the statistical term ‘correlation function’, standard in conformal field
theory. The symbol ‘$T$’ here reorders the fields $\varphi_{j_i}(x_i)$ in increasing order of time
and is needed to guarantee convergence. The number $n$ here is the total number of
particles in $|\mathrm{in}\rangle$ and $|\mathrm{out}\rangle$ together.

In classical physics, Noether’s Theorem associates with a continuous symmetry a 
conserved current $j^\mu(x)$ and a conserved charge $Q$. Now, a symmetry of a classical system
may become broken in quantisation — this is called an anomaly (see e.g. section 11-5 of 
[310]). Usually an anomaly is bad news, but a harmless anomaly important to us is the 
soft breaking of the conformal symmetry in CFT. It is measured by a parameter called 
the central charge or conformal anomaly c (Section 4.3.1). 

When a symmetry survives quantisation, the analogue of Noether’s Theorem here
is the Ward identities (see e.g. section 10.4 of [555]), which are differential equations
satisfied by the correlation functions. They take the form

$$\frac{\partial}{\partial x^\mu}\,\langle j^\mu(x)\,\varphi_{j_1}(x_1)\cdots\varphi_{j_n}(x_n)\rangle = -i\sum_{i=1}^{n}\delta(x-x_i)\,\langle\varphi_{j_1}(x_1)\cdots G_i\varphi_{j_i}(x_i)\cdots\varphi_{j_n}(x_n)\rangle \tag{4.2.12}$$

where $G_i$ is the associated representation of the symmetry on the field $\varphi_{j_i}$.
The typical, and only general, way to compute correlation functions is perturbation 
theory. The correlation functions (4.2.11) play the role here of the propagation kernel K 
in (4.2.6a); their path integral expression looks like 


1 
(iG) Ph An) = Z f Pi) Pj, An) expliS(P)/h] Do, (4.2.13a) 
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Py Py Py 
Fig. 4.7 Some two-point Feynman diagrams in the ¢* model. 


where S is the classical action (4.1.3) and the integral $\int D\varphi$ is over the space of complex-valued
functions $\mathbb{R}^4 \to \mathbb{C}$ (one such ‘wave-function’ for each field $\varphi_j$ in the theory). The
normalisation factor 1/Z in (4.2.13a) is


$$Z = \int \exp[iS(\varphi)/\hbar]\, D\varphi, \qquad (4.2.13b)$$


called a partition function for statistical reasons. We’re glossing over technicalities, but 
the technicalities are (too) easily found in the literature. Once again the mathematical 
meaning (such as it is) of (4.2.13a) is best ignored; more important are the heuristics it
suggests for perturbation. 
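
The perturbative heuristic can be tried out in a drastically simplified setting. The sketch below is a toy under loudly stated assumptions, not the construction in the text: space-time is collapsed to a single point, so the ‘field’ is one real number x; the oscillatory weight exp[iS] is replaced by the convergent weight exp(−S); and the action is taken to be S(x) = x²/2 + λx⁴/4!. The names `action`, `integrate` and `two_point` are hypothetical. The ‘two-point function’ is then the ratio of two ordinary integrals, in the same shape as (4.2.13a) and (4.2.13b).

```python
import math

def action(x, lam):
    # Toy zero-dimensional "action" (an assumption of this sketch):
    # a quadratic term plus a quartic interaction with coupling lam.
    return 0.5 * x * x + lam * x ** 4 / 24.0

def integrate(f, a=-12.0, b=12.0, steps=20000):
    # Composite Simpson's rule on [a, b]; steps must be even.
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def two_point(lam):
    # Analogue of (1/Z) * integral of x*x*exp(-S), with Z = integral of exp(-S).
    z = integrate(lambda x: math.exp(-action(x, lam)))
    numerator = integrate(lambda x: x * x * math.exp(-action(x, lam)))
    return numerator / z

if __name__ == "__main__":
    for lam in (0.0, 0.01, 0.1):
        print(lam, two_point(lam), 1.0 - lam / 2.0)
```

Expanding exp(−λx⁴/4!) under both integral signs and keeping the first order in λ gives the estimate 1 − λ/2, which the numerical ratio approaches for small λ. This is the zero-dimensional shadow of keeping only the first two diagrams.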

For that purpose consider a toy model: a single self-adjoint scalar field $\varphi = \varphi^\dagger$, with
$\varphi^4$ interaction term: $\mathcal{L} = -\frac{1}{2}\sum_\mu \partial_\mu\varphi\,\partial^\mu\varphi - \frac{1}{2}m^2\varphi^2 - \frac{\lambda}{4!}\varphi^4$ (for typographical clarity we
adopt here the usual conventions $c = \hbar = 1$). As always, the equations are simpler if we
Fourier-transform to momentum space. The two-point function yields


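$$\langle \varphi(p')\,\varphi(p'') \rangle = (2\pi)^4\, \delta^4(p' + p'') \left\{ \frac{i}{p'^2 - m^2} - \frac{\lambda}{(p'^2 - m^2)^2} \lim_{\epsilon \to 0} \int \frac{d^4p}{(2\pi)^4}\, \frac{1}{p^2 - m^2 + i\epsilon} + \cdots \right\}. \qquad (4.2.13c)$$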
The Dirac delta factor expresses momentum conservation. The integral in (4.2.13c) 
doesn’t converge — this infinity is analogous to the infinite self-energy of the electron 
in classical electromagnetism (Section 4.1.3), and provides the first example of renor- 
malisation, as we will see shortly. The first two terms within the braces of (4.2.13c) 
correspond to the first two diagrams in Figure 4.7. The second diagram can be inter- 
preted as a particle emitting a pair of virtual particles, which then annihilate themselves. 
The four-point function $\langle\varphi(p_1)\,\varphi(p_2)\,\varphi(p_3)\,\varphi(p_4)\rangle$, computed to $\lambda^1$ accuracy, includes
the diagrams of Figure 4.8. 

The Feynman rules describe how to go from the finitely many Feynman diagrams 
at each perturbation order $\lambda^n$, to the corresponding integral expressions. Any book on
quantum field theory (e.g. [310] or [555]) describes them in detail, as they are how the 
theory makes practical contact with experiment. We will make only general remarks.

Fig. 4.8 Some four-point Feynman diagrams in the $\varphi^4$ model.


Fig. 4.9 A typical fourth-order term in the scattering of two electrons. 


We can write (4.2.13a) symbolically as 
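
$$\int \varphi_{j_1}(x_1) \cdots \varphi_{j_n}(x_n)\, \exp[iS(\varphi)]\, D\varphi = \sum_G c(G) \int \prod_e [dp_e] \prod_v [\hat{V}_v\, \delta_v], \qquad (4.2.13d)$$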


where the sum is over all Feynman diagrams G with the external lines (i.e. edges with
a free endpoint) corresponding to the fields $\varphi_{j_i}$ in the n-point function. The numerical
quantity c(G) is combinatorial. For each internal edge e there is a ‘propagator’, a
momentum $p_e$ and an integral over $p_e$. At each vertex v there is an operator $\hat{V}_v$, which
is proportional to the coupling constant, as well as a Dirac delta $\delta_v$, which expresses
momentum conservation at that vertex. Thus each vertex contributes a factor of the coupling
constant (which is assumed to be small). The vertices in Figures 4.7 and 4.8 are
all of valence 4, because the only interaction term in the Lagrangian density $\mathcal{L}$ here is
$\varphi^4$. More interesting (and physically relevant) quantum field theories involve several
types of particles, with several different interaction terms in the Lagrangian, and so the
corresponding Feynman diagrams have several types of edges (one for each kind of
particle) and several kinds of vertices (one for each term in the interaction Lagrangian).
For example, in QED (footnote 8 in this chapter) the interaction term is $-e\bar{\psi}A\psi$, where
e is the coupling constant (proportional to the charge of the electron) and where $\psi$ is the
(multi-component) field of the electron, $\bar{\psi}$ (essentially the adjoint of $\psi$) can be thought
of as the positron field and A can be identified with the photon field. A vertex here must
consist of three particles: a single incoming or outgoing photon, with an incoming and
outgoing electron or positron. A typical Feynman diagram involved in the calculation of
the four-point function $\langle\psi(p_1)\,\psi(p_2)\,\psi(p_1')\,\psi(p_2')\rangle$ is shown in Figure 4.9. It describes
the virtual event where the incoming electrons (the bottom two solid lines) exchange
a virtual photon (the horizontal wavy line), which in transit spontaneously breaks into
an electron-positron pair, which then annihilate, returning the photon. All vertices in
Figure 4.9 are consistent with the interaction term; as there are four of them there, that
diagram contributes to the $e^4$ term.

Fig. 4.10 Feynman diagrams contributing to the mass shift.

In order for an expansion in $\lambda^n$ (or $e^n$) to make sense, the individual terms should tend
to 0 with n. Embarrassingly, in a typical quantum field theory most individual terms are 
infinite! A simple example is the two-point function (4.2.13c) at one loop — the problem 
there is that the integrand doesn’t go to 0 fast enough for large p. A different infinity 
provides a clue how to make sense of these perturbative expansions. 
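
Even when every term of a perturbative expansion is finite, the terms need not tend to 0. The sketch below illustrates this in a zero-dimensional toy, which is an assumption of this example and not a model from the text: the ‘partition function’ is the ordinary integral of exp(−x²/2 − λx⁴/4!) over the real line, normalised by its value at λ = 0. Expanding in λ and integrating term by term gives coefficients built from the Gaussian moments (4n − 1)!!, which grow factorially. The names `series_term` and `exact_ratio` are hypothetical.

```python
import math

def series_term(n, lam):
    """n-th term of the formal power series of Z(lam)/Z(0) in lam."""
    # Gaussian moment <x^(4n)> = (4n - 1)!! = 1 * 3 * 5 * ... * (4n - 1)
    double_factorial = 1
    for k in range(1, 4 * n, 2):
        double_factorial *= k
    return (-lam / 24.0) ** n * double_factorial / math.factorial(n)

def exact_ratio(lam, half_width=12.0, steps=20000):
    """Z(lam)/Z(0) by composite Simpson's rule on [-half_width, half_width]."""
    def weight(x):
        return math.exp(-0.5 * x * x - lam * x ** 4 / 24.0)
    h = 2.0 * half_width / steps
    total = weight(-half_width) + weight(half_width)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * weight(-half_width + k * h)
    return (total * h / 3.0) / math.sqrt(2.0 * math.pi)

# Terms and partial sums of the series at lam = 1.
terms = [series_term(n, 1.0) for n in range(12)]
partial_sums = [sum(terms[: n + 1]) for n in range(12)]

if __name__ == "__main__":
    print("numerical value:", exact_ratio(1.0))
    for n, (t, s) in enumerate(zip(terms, partial_sums)):
        print(n, t, s)
```

In this toy the terms shrink for the first few orders and then grow without bound, so the series has zero radius of convergence. The first few partial sums nevertheless bracket the numerical value, which is the sense in which such an expansion can still be useful.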

We know from free field theory that the term $-\frac{1}{2}m^2\varphi^2$ in the $\varphi^4$ Lagrangian is a kinetic
energy term, and so it is tempting to identify m there with the mass of the $\varphi$ particle.
However, that parameter m is not directly observable. The (squares of the) true masses of
the particles are defined to be the corresponding eigenvalues of the operator $\sum_\mu P_\mu P^\mu$
(again ignoring $\hbar$’s and c’s). The easiest way to compute these eigenvalues is through
the two-point function $\langle\varphi(p')\,\varphi(p'')\rangle$ (called the propagator of $\varphi$): by nonperturbative
arguments (see e.g. section 10.2 of [555]), the propagator of $\varphi$ should equal the Dirac
delta $(2\pi)^4\delta^4(p' + p'')$ times a meromorphic function with a simple pole at $p'^2 = m_0^2$ (the
physical mass-squared of the particle)¹⁰ with residue i. In the $\varphi^4$ theory, the propagator
to zeroth order (corresponding to the free theory) is $i/(p'^2 - m^2)$, ignoring the Dirac
delta factor. However, the perturbative expansion contains geometric series that change
the pole. In particular the sequence of diagrams in Figure 4.10 contributes to shifting the
denominator, and hence the pole, of the propagator. We call the nonphysical parameter
m appearing in the Lagrangian the ‘bare mass’, in contrast to the true observed mass
$m_0 = m - \delta m$ that is ‘dressed’ with the cloud of virtual particles arising by virtue of the
interaction terms.
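
The geometric series mentioned above can be made explicit. As a sketch, write $-i\Sigma$ for the contribution of one loop insertion of the kind appearing in Figure 4.10, treat it as a constant, and ignore the Dirac delta factor. The symbol $\Sigma$, its normalisation and the constancy are assumptions of this sketch, not notation from the text. Summing the chain of diagrams with k = 0, 1, 2, ... insertions gives

```latex
\frac{i}{p'^2 - m^2} \sum_{k=0}^{\infty} \left( (-i\Sigma)\, \frac{i}{p'^2 - m^2} \right)^{k}
  = \frac{i}{p'^2 - m^2} \cdot \frac{1}{1 - \Sigma/(p'^2 - m^2)}
  = \frac{i}{p'^2 - m^2 - \Sigma}
```

so, at least formally, the pole moves from $p'^2 = m^2$ to $p'^2 = m^2 + \Sigma$, and it is this shifted value that is to be identified with the physical mass-squared.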

The actual values of m and $\delta m$ can be ignored, since in any physically relevant expression
they appear only in the combination $m - \delta m$, which can be replaced by the measured
value $m_0$ of the physical mass. This is an example of renormalisation, and in itself is a
standard and uncontroversial ingredient in any physical theory.

10 There is some evidence (by studying the ‘running coupling constant’) that the propagator of the photon in
QED has, in addition to the pole at mass zero (corresponding to the massless photon), a pole at imaginary
mass. This would correspond to a tachyon (footnote 4 in this chapter) called the Landau ghost, which
presumably shouldn’t exist. This calculation could indicate a fundamental inconsistency with QED at high
energies, but more conservatively may merely indicate a collapse of the perturbative approximation at
high energies. Even if each term in the perturbative expansion of QED can be made finite and well-defined
(which at present requires ad hoc constructions like ‘infrared cut-offs’), the full sum over all perturbative
orders probably won’t converge in any sense. Indeed, the perturbative expansion is a power series in the
coupling constant e; if it converged for some small (positive) value of e, then it should also converge for
some negative values of e, which for physical reasons is impossible. More generally, many suspect that a
consistent quantum field theory must be ‘asymptotically free’ (i.e. the particles act as if they are free of
interactions when the momenta are large). QED is not asymptotically free, but the Standard Model is.
However, the Standard Model has other problems (due to the Higgs scalar field) and many suspect that it
too is inconsistent.

However, the mass shift $\delta m$ can be calculated perturbatively, and in a typical quantum
field theory is infinite. Thus in order to account for the observed masses of the particles, 
the mass parameters in the Lagrangian would also be infinite, which is silly. Nevertheless, 
the renormalisation scheme given in the previous paragraph works to give sensible and 
accurate answers. 

Likewise, the fields $\varphi$ and coupling constants $\lambda$ (in short, everything!) appearing in
the Lagrangian are also unobservable. The coupling constants $\lambda$ are renormalised analogously
to mass, using the observed strengths of the corresponding interaction, and as
usual the rescaling is by an infinite factor. The physical ‘renormalised’ fields, properly
interpolating between the incoming and outgoing free fields, are scalar multiples $Z_\varphi^{-1/2}\varphi$
of the Lagrangian ‘bare’ fields. This follows, for example, by the residue (call it $Z_\varphi i$) of
the propagator of the bare field: for the physical field it must equal i, but in a theory with
interactions we’ll have $Z_\varphi \ne 1$ (in fact typically $Z_\varphi$ is infinite). In short, the equal-time
commutation relation (4.2.9a) (obeyed by the bare fields) and the residue i of the propagator
(necessarily satisfied by the physical fields) are incompatible, and so the bare fields aren’t
physical. Once again it is not surprising that we must renormalise; what is disturbing is that
the renormalisation is infinite.

Quantum field theory makes sense of (i.e. systematically removes) the infinities arising
in perturbation theory by a combination of two procedures. The first, called regularisation
(Section 4.2.3), introduces some new parameter, call it $\Lambda$, and replaces the divergent
quantity by a limit as $\Lambda$ goes to $\infty$, say, of finite quantities. This nonphysical parameter
$\Lambda$ may be a large momentum cutoff (which corresponds to a small distance cutoff),
although more sophisticated cutoffs are common. As long as $\Lambda$ is finite, the calculation
will also be finite, but it will depend on $\Lambda$ (as well as the various parameters $m, \lambda, \ldots$ in
the Lagrangian). However, if we choose (‘renormalise’) those parameters $m, \lambda, \ldots$ so as
to depend on $\Lambda$ in such a way that the physically relevant quantities are independent of
$\Lambda$ (or at least have a finite limit), we can then take the limit $\Lambda \to \infty$ and get a sensible
answer (even though the bare parameters $m, \lambda, \ldots$ will diverge in that limit). We then
take those ‘sensible answers’ to be the predictions of the theory.
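
The two-step procedure can be mimicked with elementary functions. The sketch below is a hypothetical toy, not a formula from the text: the ‘observable’ is F(E) = 1/(a + I(E)), where a is a bare parameter and I(E) is the integral of dk/(k + E) over k from 0 to infinity, which diverges logarithmically. Regularisation cuts the integral off at k = cutoff; renormalisation chooses a as a function of the cutoff so that F takes a measured value at one reference energy. All function names and the form of F are assumptions made for illustration.

```python
import math

def loop_integral(energy, cutoff):
    # Regularised version of the divergent integral of dk / (k + energy)
    # over k from 0 to infinity: cut off at k = cutoff.
    return math.log((cutoff + energy) / energy)

def observable(energy, bare_a, cutoff):
    # Hypothetical toy "amplitude" F(E) = 1 / (a + loop integral).
    return 1.0 / (bare_a + loop_integral(energy, cutoff))

def renormalised_bare_parameter(measured_value, reference_energy, cutoff):
    # Choose the bare parameter a(cutoff) so that F(reference_energy)
    # reproduces the measured value for this particular cutoff.
    return 1.0 / measured_value - loop_integral(reference_energy, cutoff)

def prediction(energy, measured_value, reference_energy, cutoff):
    # Prediction at another energy, using the renormalised bare parameter.
    a = renormalised_bare_parameter(measured_value, reference_energy, cutoff)
    return observable(energy, a, cutoff)

if __name__ == "__main__":
    for cutoff in (1e3, 1e6, 1e9):
        a = renormalised_bare_parameter(0.5, 1.0, cutoff)
        print(cutoff, a, prediction(2.0, 0.5, 1.0, cutoff))
```

As the cutoff grows, the bare parameter runs off to minus infinity, while the prediction at the other energy settles down to the cutoff-independent value 1/(1/F₀ + log(E₀/E)), where F₀ is the measured value at the reference energy E₀. In this toy it is the combination of a divergent bare parameter and a divergent integral that is finite, which is the pattern described in the paragraph above.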

In order to remove all infinities, it may be necessary to introduce new bare parameters
by adding new terms to $\mathcal{L}$. A quantum field theory is called renormalisable if this
procedure terminates, that is if all Feynman diagrams will be finite after introducing only
finitely many regularisors $\Lambda_i$ and renormalising the finitely many Lagrangian parameters
appropriately. The $\varphi^4$ model, QED and the Standard Model are all renormalisable. On the
other hand, a quantum field theory for gravity in four dimensions, in the spirit of general
relativity, is doomed to be nonrenormalisable. Renormalisability is a strong constraint
on a theory: for example, it forbids fields with high spin and interaction terms involving
many derivatives or products of many fields. In particular, the only interaction terms
allowed in the Lagrangian of a renormalisable four-dimensional quantum field theory of
a single self-adjoint scalar $\varphi$ are $\varphi^i$ for $1 \le i \le 4$ and $\sum_\mu \partial_\mu\varphi\,\partial^\mu\varphi$.
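
One standard heuristic behind such a list is dimension counting. The following is a sketch, assuming units with $c = \hbar = 1$ so that every quantity is measured in powers of mass; the notation $[X]$ for the mass dimension of X and $g_i$ for the coefficient of $\varphi^i$ is introduced only for this sketch. Since the action $S = \int \mathcal{L}\, d^4x$ is dimensionless:

```latex
[d^4x] = -4 \;\Rightarrow\; [\mathcal{L}] = 4, \qquad
[\partial_\mu] = 1,\quad [\partial_\mu\varphi\,\partial^\mu\varphi] = 4 \;\Rightarrow\; [\varphi] = 1, \qquad
[g_i\,\varphi^i] = 4 \;\Rightarrow\; [g_i] = 4 - i .
```

Couplings of negative mass dimension, here those with $i > 4$, are the ones that this power counting flags as nonrenormalisable.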

A nonrenormalisable theory can always be renormalised (i.e. its divergences all 
removed) by adding infinitely many new terms to the Lagrangian (along with infinitely
many regularisors $\Lambda_i$). The problem is that to fix the renormalised values of all those
new coupling constants, we would need to perform infinitely many experiments. It would 
thus appear (and is often argued) that renormalisability would be a necessary condition 
for a physically relevant, predictive quantum field theory. Such a nonrenormalisable 
theory would display behaviour that is sensitive to the detailed structure at a much more 
microscopic level. This behaviour would appear random at the scale on which we are 
trying to focus. For a macroscopic example, consider the propagation of cracks in glass. 

On the other hand, it is possible that all but finitely many of those new parameters will 
arise in perturbation terms that will be insignificant until the energies of the particles are 
sufficiently large (e.g. they could involve new particles with very large masses). That is, 
the contributions from all but finitely many of those parameters could be exponentially 
suppressed and thus be ignored. Such a theory would be essentially predictive as long as 
we kept the energies of the collisions far less than the masses of these new and irrelevant 
particles. Such a nonrenormalisable theory would describe the low energy limit of a more 
fundamental theory — its nonrenormalisability arises because there is pertinent physics 
that is not yet accounted for, which occurs at a smaller, deeper scale. For example, 
quantum gravity could be the low-energy limit of string theory. 

In other words, nonrenormalisability could be the norm, as presumably all of our 
theories are merely limits of deeper ones. A renormalisable theory is merely one in 
which the deeper physics involves a much higher energy scale (equivalently, a smaller 
distance scale) than the ones attained in our present experiments. It is a happy accident 
that the Standard Model is renormalisable. For example, QED applied to a hydrogen 
atom (an electron moving about a proton) is renormalisable, but is nonrenormalisable 
when applied instead to a deuteron (an electron moving about a proton-neutron nucleus).
The difference is that the physics describing the single proton concerns much smaller
distances (approximately $10^{-13}$ cm) and higher energies than that describing the electron’s
motion in hydrogen (which involves distances on the order of $10^{-8}$ cm), while the
physics describing the deuteron nucleus also occurs at roughly the same $10^{-8}$ cm scale.

On a conceptual level, this renormalisation scheme is clearly unsatisfactory. The infinities
appearing throughout renormalisation tell us that the fields and parameters appearing
in $\mathcal{L}$ are not only nonphysical, but are also nonmathematical. The former is not surprising;
the latter gives powerful evidence that the Lagrangian approach to quantum field theory 
should be avoided. Nevertheless, it works: not only does it permit unambiguous numeri- 
cal predictions from the Standard Model, but those predictions match up admirably with 
experiment. 

It is easy to get the impression that, whatever its value may be to the pragmatic working 
physicist, renormalisation should best be avoided by the much more delicately disposed 
mathematician. Indeed much effort, though with comparatively little success, has been 
directed at nonperturbative quantum field theory. However, there are many situations 
where the mathematics arising in perturbation is fascinating. For example, the modular 
forms arising in string theory, and the Riemann surfaces of conformal field theory, arise 
directly in the perturbation expansion of string theory. Kreimer, Broadhurst and Connes 
(see [105], [361] and references therein) are studying the knot theoretic, Hopf algebraic
and number theoretic structure arising in perturbative quantum field theory. Perturbative
Chern-Simons theories give both Vassiliev link invariants [38] and Gromov-Witten
invariants (see e.g. the review [403]), depending on how they are perturbatively expanded.
We know that what we call perturbative quantum field theory has direct relevance to both 
mathematics and physics; what hasn’t been worked out yet in a conceptually satisfying 
manner is its precise relationship with ‘true’ quantum field theory (whatever that is). 

This relationship is still mysterious after half a century of work. But recall that 
Newton’s calculus took well over a century to make mathematical sense, even though 
it gave good physics from the beginning. Dirac’s use of his delta functions was a much 
humbler example, but still took several years before Schwartz mathematically legit- 
imised them as distributions. Attempts to make direct sense of quantum field theory are 
discussed in Section 4.2.4. We are not merely discussing here the rigorous proof of phys- 
ical conjectures that are almost certainly true — the importance of that activity is easy 
to overestimate. Rather, we are speaking of making coherent, of finding the meaning 
of, quantum field theory. There have also been several proposals for a new mathematics 
underlying quantum field theory. For example, we have the Barrett and Crane interpre- 
tation of Feynman diagrams as morphisms in a tensor category (dynamics here comes 
from representations of the Poincaré group thought of as a 2-category), or Connes’ non- 
commutative geometry (where the geometry of space-time is replaced with an algebra 
of functions). Some of these approaches are discussed in [28]. 

Of course quantum field theory cannot be identified with perturbative quantum field 
theory. There are important nonperturbative effects, which cannot be seen in the pertur- 
bative expansion. Typical examples are quantum effects due to topologically nontrivial 
extended solutions to the classical field theory, such as magnetic monopoles (particles 
carrying magnetic charge) and instantons (solutions concentrated near a point in space- 
time rather than along a world-line as happens for particles). 

There are other challenges to the coherence of quantum field theory as it is prac- 
tised today. A famous example is Haag’s Theorem (1955), which is rigorously proved 
in the context of the Wightman axioms (see e.g. [518]). It says that, given the assump- 
tions built into the picture of quantum field theory sketched above, the S-matrix is 
very ill-defined unless the theory is free (which isn’t physically interesting). We know 
(Theorem 2.4.2) that there is a unique irreducible unitary representation of the finite- 
dimensional Heisenberg algebras, but this breaks down for infinite-dimensional ones 
(Question 2.4.2). Thanks to the equal-time commutation relations (4.2.9), the space-smeared
fields $\varphi_i(f)$ of a quantum field theory define at each time t a unitary representation
of an infinite-dimensional Heisenberg algebra (just use countably many test
functions f with disjoint support). For a fixed quantum field theory, the representations at
different times t are unitarily equivalent via the time-evolution operator $U(t) := e^{itH}$,
so each theory defines a unique fixed representation. Haag’s Theorem tells us that the
representations for different values of the coupling constant will be equivalent only if
the theories are equivalent. So if our theory is nontrivial, its Heisenberg representation
will be different from that of the free theory, that is from that of our so-called asymptotic
$t \to \pm\infty$ theories. Thus the limits $U(\pm\infty)$ can’t be well defined, and the justification
for quantum field theory as interpolating between incoming and outgoing states must be
dropped (or at least seriously weakened). 

One escape is to throw away the equal-time commutation relations (Section 4.2.4). 
After all, we know that the renormalised (physical) fields won’t satisfy them. Also, it 
seems highly dubious to claim that (4.2.9) are physically relevant, if (4.2.9) permits us to 
smear fields only in the space direction. We should also smear in the time direction, which 
means we can no longer speak of equal-time relations and the simplicity of (4.2.9) will 
be lost. On the other hand, (4.2.9) are important, for example, for the usual interpretation 
of the number operator, and hence are central to the particle interpretation. 

The attitude taken by most practitioners of quantum field theory towards these various 
mathematical difficulties is much like that taken by the author of this book towards 
most of Life’s Little Crises: avoidance. ‘Tomorrow they may just go away.’ After all,
this strategy worked fine with those monsters haunting the night-time shadows of our 
childhood. 

There are formal similarities between quantum field theory and (classical) statisti- 
cal mechanics. More precisely, path integral expressions in quantum field theory in 
d-dimensional space-time are the same as, or at least analogous to, thermal averages in 
statistical mechanics in d + 1 space-time dimensions, when the time t is replaced by
$-i/(kT)$ (in units with $\hbar = 1$), where k is Boltzmann’s constant and T is the temperature. The weak coupling
limit in quantum field theory corresponds to the high-temperature limit. Quantum fluctu- 
ations about a classical solution correspond to statistical fluctuations about a thermody- 
namic equilibrium. We won’t have much more to say about this connection, though it has 
been extremely fruitful. For example, spontaneous symmetry breaking in the Standard 
Model, needed to give masses to particles like the electron, is a phase transition. The 
Klein—Gordon equation, governing as we know scalar fields, also describes excitations 
of a dense plasma, or of vortex motions in liquid helium. Conformal field theories, as we 
shall see next section, can arise both from quantum field theories (string theories) and 
from statistical mechanics. Incidentally, the transition to imaginary time has an important 
place in quantum field theory, where it is called ‘Wick rotation’, and is related to the 
holomorphicity of the Wightman functions discussed in Section 4.2.4. 
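
A sketch of how imaginary time turns the path-integral weight into a Boltzmann-like weight, in the simplest case of a single particle of mass m moving in a potential V. This example and the symbols $\tau$ and $S_E$ are assumptions of the sketch, not taken from the text.

```latex
S = \int \left( \tfrac{1}{2}\, m \left( \frac{dx}{dt} \right)^{2} - V(x) \right) dt,
\qquad t = -i\tau
\;\;\Longrightarrow\;\;
iS = -\int \left( \tfrac{1}{2}\, m \left( \frac{dx}{d\tau} \right)^{2} + V(x) \right) d\tau =: -S_E,
\qquad
e^{iS/\hbar} \;\longmapsto\; e^{-S_E/\hbar}.
```

The oscillating phase becomes a decaying exponential of an energy-like functional, which has the same shape as a Boltzmann factor $e^{-E/kT}$.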

The operators in both classical and quantum mechanics form an algebra. This cannot 
be directly true in quantum field theory, because the product of distributions is not usually 
a distribution. It does not make mathematical sense to multiply fields $\varphi_1(x)$, $\varphi_2(y)$ at the
same space-time point x = y. Nevertheless, the Lagrangian density, as well as the equal- 
time commutation relations and many other familiar expressions in quantum field theory, 
do precisely that. Kenneth Wilson proposed the operator product expansion (OPE) as 
a way to make sense of this. As it is a standard tool of conformal field theory, we 
defer its treatment to Section 4.3.2. Wilson intended this OPE to be an alternative to the 
problematic (4.2.9), but as too often happens, his attempt at reformation was absorbed 
into The System and has become one of its standard tools. The other way to make the 
operators into an algebra is to smear them, and that is the approach taken by Wightman. 

Modern quantum field theory is based on the notion of a gauge symmetry. To help 
understand this important concept, consider the following toy model: a two-dimensional
classical particle (x(t), y(t)), with equations of motion


$$\frac{d^2}{dt^2}x(t) + u(t)\,x(t) = 0 = \frac{d^2}{dt^2}y(t) + v(t)\,y(t), \qquad (4.2.14a)$$


for some fixed functions u, v. Writing z = x + iy and w =u + iv, this becomes the 
simpler 


$$\frac{d^2}{dt^2}z(t) + w(t)\,z(t) = 0. \qquad (4.2.14b)$$


Of course, this system has a $U_1(\mathbb{C})$ symmetry, corresponding to a rotation of the z-plane:
for any fixed $e^{i\theta} \in U_1(\mathbb{C})$, z(t) is a solution of (4.2.14b) iff $e^{i\theta}z(t)$ is a solution. We
call this a global (as opposed to local) symmetry, because $e^{i\theta}$ must be constant if it is
to define a symmetry of (4.2.14b). However, we can rewrite our system so that $U_1(\mathbb{C})$
becomes a local (time-dependent) symmetry. Introduce a function A(t) (which will serve
as a book-keeping or compensating device) and replace each derivative d/dt in (4.2.14b)
with the differential operator $d/dt - iA(t)$, so (4.2.14b) becomes


$$\left(\frac{d}{dt} - iA(t)\right)\left(\frac{d}{dt} - iA(t)\right) z(t) + w(t)\,z(t) = 0. \qquad (4.2.14c)$$


This system (4.2.14c) has a local $U_1(\mathbb{C})$ symmetry: for any smooth real function $\theta(t)$, that is
any smooth map $e^{i\theta} : \mathbb{R} \to U_1(\mathbb{C})$, $(z(t), A(t))$ is a solution to (4.2.14c) iff
$(e^{i\theta(t)}z(t),\, A(t) + \frac{d}{dt}\theta(t))$ is a solution
to (4.2.14c). Physically, this local symmetry corresponds to the freedom of rotating the
system (or the observer) differently at each moment of time. We know from elementary
physics that doing this requires introducing the centrifugal forces intimate to all amusement
park aficionados. Indeed, we can think of (4.2.14c) as being the equation of motion
of a particle z under the influence of a new external force described by A, in addition to
the original force described by w. This is the origin of the ‘new’ external force A.
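
The local symmetry can be checked in a few lines. As a sketch, write $D_A = d/dt - iA(t)$ (a shorthand introduced only here) and $A' = A + d\theta/dt$. Then

```latex
D_{A'}\big(e^{i\theta} z\big)
  = \Big( \frac{d}{dt} - iA - i\frac{d\theta}{dt} \Big)\big(e^{i\theta} z\big)
  = e^{i\theta} \Big( \frac{dz}{dt} + i\frac{d\theta}{dt}\,z - iAz - i\frac{d\theta}{dt}\,z \Big)
  = e^{i\theta}\, D_A z,
\qquad\text{hence}\qquad
D_{A'} D_{A'}\big(e^{i\theta} z\big) + w\,e^{i\theta} z
  = e^{i\theta}\big( D_A D_A z + w z \big).
```

The extra term produced by differentiating the time-dependent phase is exactly cancelled by the shift of A, which is why A is called a compensating device.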

For historical reasons, local symmetries such as the $U_1(\mathbb{C})$ of (4.2.14c) are called
‘gauge symmetries’ (gauge here means calibration or scaling). What is significant here
is that ‘gauging’ a global symmetry associates with it a new force; changing the gauge
(e.g. rotating the z-plane) is indistinguishable from the action of an apparent force (e.g.
a centrifugal one). In the trivial example given above, the force is globally ‘fictitious’
and the gauging process (4.2.14b) $\to$ (4.2.14c) involves no new physics, since we can
always solve $A(t) + \frac{d}{dt}\theta(t) = 0$ for $\theta$ and thus ‘gauge away’ the force A.
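
Concretely, one choice of $\theta$ that gauges A away is the following (a sketch; the base point 0 of the integral is arbitrary):

```latex
\theta(t) = -\int_0^t A(s)\, ds
\quad\Longrightarrow\quad
A(t) + \frac{d\theta}{dt}(t) = 0 ,
```

so that in the new gauge the equation (4.2.14c) for $e^{i\theta}z$ reduces to (4.2.14b).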

Remarkably, all fundamental forces in Nature (namely, gravity, electromagnetism, 
and the strong and weak nuclear forces) can be obtained by gauging a global sym- 
metry. Consider first special relativity (Section 4.1.2) and for simplicity a single free 
particle x(t). There, the Poincaré group acts as a global symmetry. It says that the laws 
of physics shouldn’t depend on the choice of origin and inertial observer (coordinate 
axes). It is a global symmetry, in the sense that once those two choices are made, all 
observers (regardless of the space-time point x they animate) must agree to use that 
same origin and coordinate axes in comparing their observations, in order to have a 
symmetry. This rigidity, this global collaboration, seems physically artificial. What happens
if we gauge this symmetry? That is, permit each observer (i.e. each space-time
point) to independently choose an origin and coordinate axes. What does that anarchy
mean for our description of the relativistic particle? Simply that its coordinates will have
changed: $x(t) \mapsto x'(t) = \alpha(x(t))$ where $\alpha : \mathbb{R}^{3,1} \to \mathbb{R}^{3,1}$ encapsulates our new gauge.
We require this global change of variables to be invertible, that is to be a diffeomorphism
of Minkowski space. So our choice of gauge reduces to a choice of diffeomorphism $\alpha$.
Making the equation of motion independent of that choice $\alpha$ requires introducing book-keeping
functions $A^\mu_{\nu\rho}$, so that the original equation of motion $d^2x^\mu/dt^2 = 0$ becomes

$$\frac{d^2x^\mu}{dt^2} = \sum_{\nu,\rho} A^\mu_{\nu\rho}\, \frac{dx^\nu}{dt}\, \frac{dx^\rho}{dt}.$$

Requiring this equation to be equivalent to the original one, we recognise that the components
$A^\mu_{\nu\rho}$ are (up to a sign) none other than the Christoffel symbols $\Gamma^\mu_{\nu\rho}$, and that
the equation of motion is simply the geodesic equation. The new force corresponding
to these A’s is identified by Einstein’s equivalence principle with gravity. The question
of whether gravity can be ‘gauged away’, that is whether it is globally fictitious and our
calculations have been merely a formal mathematical game, reduces to the question of
whether space-time is globally flat. It is here (allowing for the suddenly natural possibility
that space-time is not flat) that new physics enters. The real purpose of gauging
the symmetry of Minkowski space-time (Einstein’s requirement of ‘general covariance’)
was to lead us to the idea of curved space-time and the associated force (which by
independent reasoning we identify with gravity). More generally, gauging is a guide for
introducing a new force into a theory with a global symmetry: the so-called principle of
minimal interactions.
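
In the flat case the book-keeping functions can be computed directly with the chain rule. As a sketch, write $x'^\mu = \alpha^\mu(x)$ and use the inverse map to express x in terms of $x'$; the index names are assumptions of this sketch. Since $d^2x^\nu/dt^2 = 0$,

```latex
\frac{d^2 x'^\mu}{dt^2}
  = \sum_{\nu,\rho} \frac{\partial^2 \alpha^\mu}{\partial x^\nu\, \partial x^\rho}\,
    \frac{dx^\nu}{dt}\, \frac{dx^\rho}{dt}
  = \sum_{\sigma,\tau} \Big( \sum_{\nu,\rho}
    \frac{\partial^2 \alpha^\mu}{\partial x^\nu\, \partial x^\rho}\,
    \frac{\partial x^\nu}{\partial x'^\sigma}\, \frac{\partial x^\rho}{\partial x'^\tau} \Big)
    \frac{dx'^\sigma}{dt}\, \frac{dx'^\tau}{dt} .
```

The bracket plays the role of the book-keeping functions in the new coordinates. It vanishes identically exactly when $\alpha$ is affine, for instance a Poincaré transformation, which is the sense in which the original symmetry was only global.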

Gauging works similarly in quantum field theory. QED results from gauging the $U_1(\mathbb{C})$
symmetry of free theories. The global $U_1$ symmetry, $\psi(x) \mapsto e^{i\theta}\psi(x)$, corresponds to
the ambiguity of defining the phase of, for example, the electron field $\psi$. Once we
make the choice at one space-time point, then we must be consistent at all other points.
Incidentally, that global symmetry leads to the conservation of global electric charge, by
Noether’s Theorem. Gauging it means the phase can be changed arbitrarily at each point,
that is $\theta$ can depend on x. The associated book-keeping field $A_\mu(x)$ corresponds to the
force we call electromagnetism, and the gauge symmetry implies local conservation of
charge. For example, in the case of a charged scalar particle, the Klein-Gordon equation
(4.2.7) gauges to

$$\sum_{\mu,\nu} \eta^{\mu\nu} (\partial_\mu - iA_\mu)(\partial_\nu - iA_\nu)\varphi - m^2\varphi = 0.$$


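It is straightforward to construct a Lagrangian from the original (free) one, which yields
the new equations of motion: for example, the free Lagrangian $\sum_\mu (\partial_\mu\varphi^\dagger)(\partial^\mu\varphi) + m^2\varphi^\dagger\varphi$
for a scalar field with charge e yields

$$\sum_\mu \big((\partial_\mu + ieA_\mu)\varphi\big)^\dagger (\partial^\mu + ieA^\mu)\varphi + m^2\varphi^\dagger\varphi.$$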


But how should we think of $A_\mu$? As another elementary field in the theory. But that
means we should add a new term to the gauged Lagrangian, containing partial derivatives
of A (otherwise the Euler-Lagrange equations (4.1.8) would be trivial). The simplest
gauge-invariant, Lorentz-invariant way to do this is (4.1.13) (with V = 0), where
$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is called the field strength. This is the correct Lagrangian describing the
QED of a charged scalar particle. Changing the gauge is indistinguishable from the
matter field moving through an electromagnetic field. The associated perturbation theory
involves, in Feynman’s language, the exchange of virtual particles associated with this
new $A_\mu$ field: those new particles are called photons.
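
The gauge invariance of these expressions can be verified directly. As a sketch, take the gauge transformation in the form $\varphi \mapsto e^{-ie\theta}\varphi$, $A_\mu \mapsto A_\mu + \partial_\mu\theta$, with $\theta = \theta(x)$ an arbitrary smooth real function. This normalisation of $\theta$ is an assumption, chosen to match the combination $\partial_\mu + ieA_\mu$ of the gauged Lagrangian.

```latex
(\partial_\mu + ieA_\mu + ie\,\partial_\mu\theta)\big(e^{-ie\theta}\varphi\big)
  = e^{-ie\theta}\,(\partial_\mu + ieA_\mu)\,\varphi,
\qquad
\partial_\mu(A_\nu + \partial_\nu\theta) - \partial_\nu(A_\mu + \partial_\mu\theta)
  = \partial_\mu A_\nu - \partial_\nu A_\mu .
```

The first identity says that the gauged derivative of the field changes only by a phase, which cancels against its adjoint in the Lagrangian; the second says that the field strength is unchanged, because partial derivatives commute.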

General relativity tells us to expect a geometric picture here, and indeed that is the
case. We think of the matter fields as being sections of a fibre bundle with base $\mathbb{R}^{3,1}$ and
fibre $U_1(\mathbb{C})$; the electromagnetic field $A_\mu$ defines a connection for this bundle and $F_{\mu\nu}$
is the curvature tensor.

Similarly, the Standard Model is a gauge theory associated with the gauge group
$SU_3(\mathbb{C}) \times SU_2(\mathbb{C}) \times U_1(\mathbb{C})$. $SU_3$ here corresponds to the strong nuclear force, responsible,
for example, for the binding of quarks together to form protons and neutrons, and
the binding of protons and neutrons together to form nuclei. $SU_2 \times U_1$ describes a unification
of electromagnetism with the weak nuclear force (which describes, for example,
the decay of the neutron). What this symmetry group $SU_3 \times SU_2 \times U_1$ means physically
is less clear than it was for general relativity (or QED), and so the Standard Model lacks
the conceptual clarity of Einstein’s masterpiece. For example, many believe a deeper
quantum field theory will involve a larger gauge group, such as $E_6$.

Describing other important ingredients of the Standard Model (the fundamental fields
and how they transform under $SU_3 \times SU_2 \times U_1$) would drag us even further from the
main thread of this book. For detailed treatments of the Standard Model see, for example, 
[310], [555]. Although its comparison with experiment has been fabulous, it is surely not 
the ‘final theory’. For one thing, it suffers from all the conceptual and mathematical flaws 
mentioned in this subsection. Also, it has 18 free parameters — for example, the electron 
mass — which must be experimentally determined and (depending on how one counts) 
there are 61 ‘elementary’ particles in the theory. The Standard Model is an effective 
theory, valid only for a relatively narrow range of physics. The question is, how different 
from it will the theory superseding it look? 

Quantum field theory challenges our concept of matter. In Newtonian physics reality 
obtained its solid objective structure from an inert unanalysable ‘stuff’, from which all 
substance came; though it could change form (e.g. ice to water), it was the clay on which 
the Laws of Physics acted. As we moved into the twentieth century we learned that this 
clay could be transformed into energy (‘$E = mc^2$’), and that it is composed of atoms
that are mostly empty space. Quantum field theory goes a step beyond: the particles 
composing atoms are to empty space like sound waves are to air. Bertrand Russell was 
more accurate than he thought when, in 1956, he compared matter to Lewis Carroll’s 
Cheshire Cat which gradually faded until nothing was left but the grin — matter’s grin, 
Russell speculated, was caused by amusement at those who still think it’s there.

Likewise, our notion of force has changed from Newton’s definition F = ma, to some- 
thing that more generally changes the state of a particle, and that is due not to an active 
agent but to an indirect effect like a well-hidden symmetry — a further movement of 
physics away from the prerelativistic infatuation with intuitive space and time.


4.2.3 The meaning of regularisation


The mathematics of classical physics (symplectic geometry) is well understood, while 
that of quantum field theory isn’t. But it’s already clear that, mathematically speaking, 
quantum field theory is by far the more profound. Much as mechanics helped develop 
calculus, our standard tool for studying finite-dimensional systems, we can expect quan- 
tum field theory to supply us one day with sophisticated new tools for studying infinite 
dimensions. We are already seeing hints of this. 

To a theoretical physicist, quantum field theory is a recipe book, an infinite sequence 
of finite calculations. To a mathematician, these recipes seem ad hoc, and surprisingly 
classical and finite-dimensional for something that is emphatically neither. A hundred 
years from now we’ll look back at that recipe book much as a modern doctor reflects 
on medieval medicine: this herb is antiseptic, that incantation is mostly harmless, but 
leeches and blood-letting were simply bad ideas. 

Of all these recipes, those connected with renormalisation and regularisation generate 
the most ire. For example, even mathematical stoics cannot be unmoved by the substi- 
tution (2.3.1). Yet it is in these places where most of the magic lives, as for example the 
derivation of the Atiyah-Singer Index Theorem from anomaly cancellation indicates.

It isn’t difficult for a mathematician to appreciate the inevitability of some form of 
renormalisation. Consider, for example, the two-body Lagrangian 

$$L = \frac{1}{2} m_1 \dot{x}_1^2 + \frac{1}{2} m_2 \dot{x}_2^2 + \frac{G m_1 m_2}{|x_1 - x_2|}. \qquad (4.2.15a)$$

We can integrate out one of the particles, since the centre-of-mass $m_1 x_1 + m_2 x_2$ is
constant (without loss of generality, say it equals 0). The resulting one-particle system
is

$$L = \frac{1}{2} m \dot{x}^2 + \frac{k}{|x|}, \qquad (4.2.15b)$$

where $m = m_1 (m_1 + m_2)/m_2$ and $k = G m_1 m_2^2/|m_2 + m_1|$. We say that the mass and
coupling constants – the ‘bare’ parameters in (4.2.15a) – have been ‘renormalised’.
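
The passage from (4.2.15a) to (4.2.15b) can be checked numerically. The following is only a minimal sketch in Python: it assumes one spatial dimension, imposes the centre-of-mass constraint $m_1 x_1 + m_2 x_2 = 0$ with $x = x_1$, and uses arbitrarily chosen sample values; all function names are illustrative and not taken from the text.

```python
import math


def two_body_lagrangian(m1, m2, G, x1, x2, v1, v2):
    """Lagrangian (4.2.15a) in one dimension: kinetic terms plus G*m1*m2/|x1 - x2|."""
    return 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2 + G * m1 * m2 / abs(x1 - x2)


def renormalised_parameters(m1, m2, G):
    """Effective mass m and coupling k of the one-particle system (4.2.15b)."""
    m = m1 * (m1 + m2) / m2
    k = G * m1 * m2**2 / (m1 + m2)
    return m, k


def one_body_lagrangian(m, k, x, v):
    """Lagrangian (4.2.15b): kinetic term plus k/|x|."""
    return 0.5 * m * v**2 + k / abs(x)


def reduce_and_compare(m1, m2, G, x1, v1):
    """Return (two-body value, one-body value) with the centre of mass held at 0."""
    # The constraint m1*x1 + m2*x2 = 0 fixes the second particle in terms of the first.
    x2 = -m1 * x1 / m2
    v2 = -m1 * v1 / m2
    m, k = renormalised_parameters(m1, m2, G)
    return (two_body_lagrangian(m1, m2, G, x1, x2, v1, v2),
            one_body_lagrangian(m, k, x1, v1))


# Sample values are assumptions chosen only for illustration.
full, reduced = reduce_and_compare(m1=2.0, m2=5.0, G=0.7, x1=1.3, v1=-0.4)
print(math.isclose(full, reduced))
```

The two Lagrangians agree on every configuration satisfying the constraint, which is all that ‘renormalising’ the bare parameters means in this finite setting.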

Something similar happens whenever we integrate away degrees-of-freedom, or 
account for some effect (e.g. the unavoidable geometric series in Figure 4.10): the new 
parameters will be readjusted or renormalised compared to the old ones. This is com- 
pletely noncontroversial. What is disturbing about renormalisation in quantum field 
theory is that you are asked to add/subtract/multiply/divide infinite quantities. Regulari- 
sation is the procedure of obtaining precise numbers from such an ill-defined operation. 
In some sense, regularisation also arises in mathematics. We see it in our Dedekind
eta calculation in (2.2.9), or the Virasoro action on affine algebra modules in (3.2.13).
Sometimes analytic concerns become significant (e.g. the natural integrals or series one
would naively write down turn out to diverge). If those concerns are ignored, we obtain
incorrect answers (such as $\eta(-1/\tau) = \eta(\tau)$, or an action of the Witt algebra on affine
algebra modules). Of course what we must do is go back and do the analysis properly.
Regularisation is merely a symptom of sloppy analysis. It isn’t supposed to be the place
where the magic appears. The magic was there all along. But the penalty of pretending
that (semi-)classical calculations can capture quantum field theory is the introduction of 
regularisation schemes. The classical calculations fail to pick up that magic, which is 
then forced to arise in that final step. It’s like trying to straighten a Möbius band: as you 
move your hand around the strip, trying to keep the paper vertical, the twist is relegated 
to a smaller and smaller portion of paper until eventually the paper tears. That tear is 
called regularisation. The problem isn’t inherent to quantum field theory, the problem is 
with the fantasy that we can treat quantum field theory semi-classically. 
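
The incorrect identity $\eta(-1/\tau) = \eta(\tau)$ mentioned above can be contrasted numerically with the standard transformation law $\eta(-1/\tau) = \sqrt{\tau/i}\,\eta(\tau)$. The sketch below assumes the usual product formula $\eta(\tau) = q^{1/24} \prod_{n \ge 1} (1 - q^n)$ with $q = e^{2\pi i \tau}$, truncated after a fixed number of factors; the truncation length and the sample point are arbitrary choices made for illustration.

```python
import cmath


def dedekind_eta(tau, terms=200):
    """Truncated product q^(1/24) * prod_{n>=1} (1 - q^n), with q = exp(2*pi*i*tau)."""
    q = cmath.exp(2j * cmath.pi * tau)
    product = 1.0 + 0j
    q_power = q
    for _ in range(terms):
        product *= 1 - q_power
        q_power *= q
    return cmath.exp(2j * cmath.pi * tau / 24) * product


tau = 0.3 + 1.2j                      # any point in the upper half-plane
lhs = dedekind_eta(-1 / tau)
naive = dedekind_eta(tau)             # what the careless calculation predicts
correct = cmath.sqrt(-1j * tau) * dedekind_eta(tau)

print(abs(lhs - naive), abs(lhs - correct))
```

The first printed number is visibly nonzero while the second is at the level of rounding error: the factor $\sqrt{\tau/i}$ is exactly what the careless manipulation loses.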

Feynman once asked why the same tricks work over and over in physics. Regularisa- 
tion is Nature’s way of telling us that they don’t quite. Unfortunately, we don’t yet know 
how to go back and do the quantum field theory calculations properly. But regularisa- 
tion must supply some deep hints. For instance, the presence of infinite renormalisation 
seems to suggest that quantum field theory should be formulated without Lagrangians. 
Perhaps another hint is that the point $\infty$ is the difference between the (Riemann) sphere
and the (complex) plane, suggesting that regularisation can be interpreted as a (global) 
topological effect. In [105], [106], a projective limit of certain Lie groups, corresponding 
to the Hopf algebra of Feynman graphs, acts on the coupling constants of renormalis- 
able quantum field theories, and contains the renormalisation group as a one-parameter 
subgroup; dimensional regularisation can in some theories be interpreted as the index 
theorem in noncommutative geometry. 


4.2.4 Mathematical formulations of quantum field theory 


Making rigorous sense of quantum field theory is very difficult, as several comments 
made earlier should indicate. Even the free theories are very subtle; theories with inter- 
actions are filled with unresolved problems (Section 4.2.2). One thing is clear: quantum 
field theory as it is typically practised today (i.e. the informal theory) is mathematically 
incoherent. 

However, quantum field theory is a part of mathematics in the sense that important 
aspects of it have been encoded axiomatically and several examples (mathematically 
if not physically interesting) have been rigorously constructed. Mathematicians under- 
appreciate just how accessible quantum field theory is. The purpose of this subsection is 
to briefly describe two of the most influential of these mathematical treatments. These 
lead to two different formulations of conformal field theories, which we study in later 
chapters. The fundamental difficulty in the subject lies in rigorously constructing nontriv- 
ial examples of quantum field theories within these formulations. Only the very simplest 
theories (e.g. the free ones) have been rigorously constructed. 

The simplest and best-known mathematical treatment of quantum field theory, the
Wightman axioms [518], was first formulated in the 1950s by Gårding and Wightman.
Lagrangians and the equal-time commutation relations (4.2.9) are avoided, and instead
attention is focused on the interpolating renormalised ‘physical’ fields. This makes rigour
much easier to attain, but contact with the particle interpretation is more difficult. One
unexpected gain is the holomorphicity of the vacuum-to-vacuum expectation values.

According to Wightman, a quantum field theory consists of the data collected in the
following seven axioms W.I–W.VII. For convenience, put $c = \hbar = 1$. Naturally, there is
much overlap with the preceding material; the main clarification provided here is what
from Section 4.2.2 can (and should?) be avoided.


W.I. (relativistic state space) Let $H$ be a separable Hilbert space, carrying a continuous
unitary representation $U_{(a,\Lambda)}$ of the universal cover $\mathbb{R}^4 \rtimes SL_2(\mathbb{C})$ of the Poincaré group.
Define the self-adjoint operators $P^\mu$ by $U_{(a,I)} = \exp[i \sum_\mu P^\mu a_\mu]$; they mutually commute
so we can speak of simultaneous eigenstates. All the (simultaneous) eigenvalues
$p^\mu$ of $P^\mu$ are required to satisfy the conditions $p^4 \ge 0$ and $\sum_\mu p_\mu p^\mu \le 0$.


W.II. (vacuum) There is a state $|0\rangle \in H$, unique up to scalar multiple, invariant under
all $U_{(a,\Lambda)}$.


W.III. (fields) There is a space $D \subset H$, dense in $H$ and containing $|0\rangle$. There are a finite
number $\varphi_1, \ldots, \varphi_M$ of operator-valued tempered distributions over space-time $\mathbb{R}^4$, such
that for any ‘test function’ $f \in S(\mathbb{R}^4)$, each $\varphi_i(f)$ is an operator from $D$ to $D$. The set
of fields $\varphi_i$ is closed under adjoint (i.e. $\varphi_i^\dagger$ equals some $\varphi_j$).


W.IV. (covariance of fields) For all $(a, \Lambda) \in \mathbb{R}^4 \rtimes SL_2(\mathbb{C})$, $U_{(a,\Lambda)}(D) = D$. Equation
(4.2.8a) holds in $D$, and so the matrices $V(\Lambda)$ define an $M$-dimensional $SL_2(\mathbb{C})$-representation.


Physically, the vectors in $H$ (or rather the rays) are interpreted as the possible states
of the theory, and the $\varphi_i$ are the (renormalised interpolating) quantum fields. We discuss
tempered distributions and the Schwartz space $S$ in Section 1.3.1, and the Poincaré and
Lorentz groups and their doubles in Section 4.1.2. If there are any other symmetries of
the theory, then $H$ will also carry a unitary projective representation of those groups. The
energy-momentum operators $P^\mu$, generating space-time translations, exist because of
the assumed unitarity of the $U$’s. They mutually commute because their exponentiations
$U_{(a,I)}$ do. Up to a factor of $c^2$, the eigenvalue $p^4$ is the energy of the state and $\sqrt{-\sum_\mu p_\mu p^\mu}$
its mass $m$. We call the vector $|0\rangle \in D$ in W.II the vacuum, and normalise it so that
$\langle 0|0\rangle = 1$.

Postulating a common domain $D$ is necessary because (Section 1.3.1) unbounded
operators on a Hilbert space aren’t defined everywhere (think of differentiation on the
space of square-integrable functions $L^2(\mathbb{R})$). We see from W.III that $D$ certainly contains
the vectors obtained from the vacuum $|0\rangle$ by applying all polynomials in the smeared
fields $\varphi_i(f)$, and we learn in W.VI below that those vectors $p(\varphi(f))|0\rangle$ are indeed dense
in $H$. To some approximation, $D$ can be identified with that subspace (see page 98 of
[518]).


W.V. (local commutativity) For any pair of test functions $f, g \in S(\mathbb{R}^4)$ satisfying
$f(x)\,g(y) = 0$ whenever $(x - y)^2 \le 0$ (in other words, the supports of $f$ and $g$ are
space-like separated), and for any fields $\varphi_i$, $\varphi_j$, a sign $\pm$ (depending on $i, j$) can be
chosen so that on $D$

$$[\varphi_i(f), \varphi_j(g)]_\pm := \varphi_i(f)\,\varphi_j(g) \pm \varphi_j(g)\,\varphi_i(f) = 0.$$


W.VI. (completeness) The vacuum is cyclic for the smeared fields. That is, polynomials
in the smeared fields $\varphi_i(f)$, applied to the vacuum $|0\rangle$, form a subspace dense in $H$.


Completeness W.VI implies irreducibility of the smeared field operators, in the following
sense (inspired by Schur’s Lemma): if $B : D \to D$ is a bounded operator
satisfying

$$\langle u, B\,\varphi_i(f)\,v\rangle = \langle \varphi_i(f)^* u, B v\rangle, \qquad \forall u, v \in D,\ \forall f \in S(\mathbb{R}^4),\ \forall i = 1, \ldots, M$$

(so in this weak sense $B$ commutes with all $\varphi_i$), then $B$ is a constant multiple of the
identity. Completeness corresponds here to the remark in Section 4.2.2 that any operator
in the theory can be expressed as a function of the smeared fields.

Physically, local commutativity W.V concerns the quantum mechanical fact that measurements
localised at space-time points x and y should commute (i.e. be simultaneously
measurable without mutual interference) when x and y are space-like separated. It is a 
consequence of the axioms which sign to take, as is discussed below. 
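
The notion of space-like separation used in the local commutativity axiom can be made concrete with a small helper. This is only a sketch under stated assumptions: units with $c = 1$, time taken as the fourth coordinate, and the sign convention in which time-like vectors have negative square (as the eigenvalue $-m^2c^2$ of $\sum_\mu P^\mu P_\mu$ suggests). The function names are hypothetical.

```python
def minkowski_square(v):
    """Square of a 4-vector (x1, x2, x3, t) with signature (+, +, +, -) and c = 1."""
    x1, x2, x3, t = v
    return x1 * x1 + x2 * x2 + x3 * x3 - t * t


def is_spacelike_separated(x, y):
    """True when (x - y)^2 > 0, i.e. no signal can connect the two events."""
    difference = tuple(a - b for a, b in zip(x, y))
    return minkowski_square(difference) > 0


origin = (0.0, 0.0, 0.0, 0.0)
print(is_spacelike_separated(origin, (2.0, 0.0, 0.0, 1.0)))   # far apart in space
print(is_spacelike_separated(origin, (1.0, 0.0, 0.0, 2.0)))   # far apart in time
```

Measurements supported near the first pair of events are required to commute (or anticommute); nothing is demanded for the second pair.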

A final axiom is needed to make contact with particles (that is to say, with experiment).
As it is more technical, it is often avoided in treatments of Wightman’s axioms, and we
too will be sketchy. The basic idea is that any single particle state $|\lambda\rangle \in H$ (as usual,
$\lambda = \lambda(p)$ describes the decomposition of the state into momentum eigenstates $|p\rangle$) will
be an eigenvector for the operator $\sum_\mu P^\mu P_\mu$, with eigenvalue $-m^2c^2$ independent of $\lambda$
($m$ is the mass of the particle). On the other hand, eigenstates $|\lambda_1, \ldots, \lambda_n\rangle$ of $\sum_\mu P^\mu P_\mu$
corresponding to $n > 1$ particles will have eigenvalue varying continuously with the $\lambda_i$. In
other words, considering the spectral decomposition of the self-adjoint operator $\sum_\mu P^\mu P_\mu$
in $H$, the single particle states $|\lambda\rangle$ correspond to the discrete part of the spectrum. Call
$H_1$ the Hilbert space they span; it is a proper subspace of $H$. There need be no direct
relation between the number of elementary fields $\varphi_i$ and the types of single particles. For
example, in the Standard Model quarks correspond to elementary fields but not particles,
and protons are particles without a corresponding elementary field. We can now construct
incoming $|\lambda_1, \ldots, \lambda_n\rangle^{\mathrm{in}}$ and outgoing $|\lambda_1, \ldots, \lambda_n\rangle^{\mathrm{out}}$ $n$-particle states, corresponding in
the $t \to \mp\infty$ limits to tensor products $|\lambda_1\rangle \otimes \cdots \otimes |\lambda_n\rangle$ – see section II.V of [269] for
the detailed construction. Then the final axiom is:


W.VII. (asymptotic completeness) The incoming particle states $|\lambda_1, \ldots, \lambda_n\rangle^{\mathrm{in}}$ topologically
span $H$, as do the outgoing particle states $|\lambda_1, \ldots, \lambda_n\rangle^{\mathrm{out}}$.


Unfortunately, this treatment requires all particles in the theory to have nonzero mass, 
and so isn’t realistic. For example, in quantum electrodynamics the photon is massless 
and the electron is always surrounded by a cloud of photons, so the single electron states 
don’t belong to a discrete eigenspace of the operator $\sum_\mu P^\mu P_\mu$, but rather the eigenvalue
varies continuously with upper bound $-m^2c^2$ corresponding to the mass of the electron.
For a more sophisticated treatment of the particle concept within quantum field theory, 
see chapter VI in [269]. 

The role of the n-point functions (4.2.11) is played here by the Wightman functions,
which are also vacuum-to-vacuum expectation values but aren’t time-ordered. Let
$\varphi_{i_1}, \ldots, \varphi_{i_n}$ be $n$ fields, not necessarily distinct. Define $W_n$ to be the inner-product

$$W_n(x_1, \ldots, x_n) = W_{\varphi_{i_1}, \ldots, \varphi_{i_n}}(x_1, \ldots, x_n) := \langle 0|\,\varphi_{i_1}(x_1) \cdots \varphi_{i_n}(x_n)\,|0\rangle.$$

Of course, to make sense of this expression we must smear the points $x_i$, that is,
replace them with test functions $f_i$. Thus $W_n$ is a complex-valued function of $S(\mathbb{R}^4) \times
\cdots \times S(\mathbb{R}^4)$. Thanks to Schwartz’s Nuclear Theorem, $W_n$ has a unique extension to a
tempered distribution on $S(\mathbb{R}^{4n})$, and it is this extension that is studied. Nevertheless,
the inaccurate and occasionally misleading notation $W_n(x_1, \ldots, x_n)$ is too standard to
change.

It is possible to convert the data and properties in W.I–W.VII into constraints on the
Wightman functions. For example, the relativistic invariance of the vacuum leads to the 
expression, valid for any (A, a), 


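$$W_{\varphi_{i_1}, \ldots, \varphi_{i_n}}(x_1, \ldots, x_n) = \sum_{j_1, \ldots, j_n = 1}^{M} V_{i_1 j_1} \cdots V_{i_n j_n}\, W_{\varphi_{j_1}, \ldots, \varphi_{j_n}}(\Lambda x_1 + a, \ldots, \Lambda x_n + a), \qquad (4.2.16)$$

where the $V_{ij}$ are the matrix entries occurring in the field covariance (4.2.8a).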


As always of course, everything should be smeared, that is evaluated at $f_i \in S(\mathbb{R}^4)$ (or
$f \in S(\mathbb{R}^{4n})$). In its unsmeared form, (4.2.16) suggests that $W_n$ is actually a ‘generalised
function’ $w_n(\xi_1, \ldots, \xi_{n-1})$ of the differences $\xi_i = x_i - x_{i+1}$; the precise statement and
proof for smeared $W_n$ is given on pages 39-40 of [518].

A central result (due to Wightman) is the Reconstruction Theorem: these vacuum-to-vacuum
functions $W_n$ uniquely determine the quantum field theory. More precisely, if
a collection of tempered distributions $W_n$ satisfies all of the ‘obvious’ properties (such
as the covariance (4.2.16)) that the set of all Wightman functions should obey, then the
Hilbert space $H$ and the various fields $\varphi_i$ obeying axioms W.I–W.VI can be constructed,
and moreover any quantum field theory realising the given Wightman functions will be
equivalent to the one constructed. The general proof is notationally laborious though
fairly straightforward (it is closely related to the Gel’fand-Naimark-Segal construction
of a Hilbert space $H_\rho$ and a representation $\pi_\rho$ of a $C^*$-algebra $A$, associated with a
functional $\rho : A \to \mathbb{C}$). See section 3-4 of [518] for the explicit statement and proof for
the theory of a single free boson. The Reconstruction Theorem does not tell us when
W.VII (i.e. the particle interpretation) holds.

Wightman also proved another remarkable property of his functions: each ‘generalised
function’ $w_n(\xi_1, \ldots, \xi_{n-1})$ is the limit as $z_i \to \xi_i$ of a holomorphic function
$w_n(z_1, \ldots, z_{n-1})$ of complex variables $z_i \in \mathbb{C}^4$. The domain of holomorphicity contains
the following points: $\mathrm{Re}(z_i)$ can be arbitrary but $y_i := \mathrm{Im}(z_i)$ lies in the forward
light-cone (i.e. $y_i^4 > 0$ and $y_i \cdot y_i < 0$). So the distributions $W_n(x_1, \ldots, x_n)$ are boundary
values of the holomorphic functions $w_n(z_1, \ldots, z_{n-1})$. The proof of this is not difficult,
and involves writing $w_n(z_1, \ldots, z_{n-1})$ as the Laplace transform of the Fourier transform
of $w_n(\xi_1, \ldots, \xi_{n-1})$. Physically, this amounts to holomorphically extending from
real time (i.e. the Minkowski space-time of physics) to imaginary time (i.e. Euclidean
space-time, with better analytic properties).

As mentioned earlier, the choice of sign in W.V is fixed. In particular, if $\varphi_1$ and $\varphi_2$ have
spins $s_1$ and $s_2$ then we take the sign $-(-1)^{4 s_1 s_2}$. In small space-time dimensions
alternatives to bosons and fermions are possible (see Section 4.3.5 below), but these
exotic possibilities are precluded here by the local commutativity axiom.
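
The standard spin-statistics rule behind this choice of sign is that two fields anticommute at space-like separation exactly when both have half-integer spin, and commute otherwise. A minimal sketch follows, assuming the convention $[A, B]_\pm = AB \pm BA$ (so the value $-1$ selects the commutator) and encoding the rule as $-(-1)^{4 s_1 s_2}$; the function name is hypothetical.

```python
from fractions import Fraction


def commutation_sign(s1, s2):
    """Return -1 (commutator) or +1 (anticommutator) for fields of spins s1 and s2."""
    s1, s2 = Fraction(s1), Fraction(s2)
    for spin in (s1, s2):
        # Spins must be 0, 1/2, 1, 3/2, ...
        if spin < 0 or (2 * spin).denominator != 1:
            raise ValueError("spin must be a non-negative integer or half-integer")
    exponent = 4 * s1 * s2            # always an integer for admissible spins
    return -1 if exponent % 2 == 0 else 1


half = Fraction(1, 2)
print(commutation_sign(0, 0))         # two scalars
print(commutation_sign(half, half))   # two spin-1/2 fields
print(commutation_sign(1, half))      # a vector and a spinor
```

The exponent $4 s_1 s_2 = (2s_1)(2s_2)$ is odd only when both $2s_1$ and $2s_2$ are odd, which is why a single closed formula captures all three cases.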

Apart from free theories, very few quantum field theories obeying the Wightman 
axioms have been constructed. In 1953, Thirring rigorously constructed the first inter- 
acting theories, but these live in two-dimensional space-time. In the 1960s and 1970s 
several nontrivial theories with interactions (e.g. a single scalar with $\varphi^4$ interaction term)
were constructed in three and especially two space-time dimensions. One of the $1 mil- 
lion Clay Institute problems (see http://www.claymath.org/) is to rigorously construct 
four-dimensional gauge quantum field theories. Quite probably there are easier ways of 
becoming a millionaire. 

In the 1960s Haag and Kastler proposed a different axiomatic approach to quantum 
field theory, which although more abstract and complicated, appears to be more flexible. 
We will only sketch it here — see the excellent book [269] for a complete treatment, as 
well as several insights into general quantum field theory. This approach avoids fields, 
focusing instead on the algebra of observables — as the existence of very different-looking 
but equivalent field theories emphasises, it is the observables and not the fields that have 
a direct physical meaning. Remarkably, the entire physical content of the theory can be 
recovered from these algebras of observables. 

Their starting point is to associate with each bounded open set $O$ in space-time $\mathbb{R}^{3,1}$
a von Neumann algebra $A(O)$ of bounded operators on a fixed Hilbert space $H$. This is
the same state space $H$ as in the Wightman axioms, but its role here is much more minor.
The self-adjoint elements in $A(O)$ correspond to the measurements performable within
the region $O$, and so $O_1 \subset O_2$ implies $A(O_1) \subset A(O_2)$. If fields $\varphi$ were present, $A(O)$
would be obtained from polynomials in the smeared fields $\varphi(f)$, for test functions with
support in $O$. Conversely, one may hope to define fields $\varphi(x)$ by sending $O \to \{x\}$. Thus
this approach is related to that of Wightman, and it shares with the latter the near-absence
of nontrivial examples.


Question 4.2.1. The nonrelativistic analogue of the Poincaré group is the Galilei group,
generated by all translations $(\Delta x, \Delta t)$, all rotations $R \in SO_3$ and all ‘boosts’ in velocity
$\Delta v \in \mathbb{R}^3$, as in (4.1.7b). Galilean invariance for nonrelativistic quantum mechanics says
that, for any element $a = (R, \Delta v, \Delta x, \Delta t)$ of the Galilei group, a wave-function $\psi(x)$
satisfies Schrödinger’s equation (4.2.1) iff the corresponding transformed wave-function
$\psi'(x')$ (whatever that is) satisfies

$$i\hbar\,\frac{\partial \psi'(x')}{\partial t'} = -\frac{\hbar^2}{2m}\,\nabla'^2 \psi'(x') + V(x)\,\psi'(x'),$$

where $x' = a.x = (t\,\Delta v + Rx + \Delta x,\ t + \Delta t)$ as usual. Show that the obvious transformation
formula $\psi'(x') = \psi(x)$ (corresponding to a nonrelativistic scalar) fails
here. Rather than transforming in a representation of the Galilei group, $\psi$ must
transform in a projective representation. Show that the transformation law $\psi'(x') =
\exp[i\,\Delta\varphi(x)/\hbar]\,\psi(x)$ works, where $\Delta\varphi(x) = m\,(\Delta v) \cdot x + \frac{m}{2}\,(\Delta v)^2\, t$.


Question 4.2.2. Let $V_0$ be a constant. Solve the one-dimensional Schrödinger equation
for the potential

$$V(x) = \begin{cases} V_0 & \text{for } -1 \le x \le 1, \\ 0 & \text{otherwise,} \end{cases}$$

with the condition that both $\psi$ and $\frac{d\psi}{dx}$ be continuous at $x = \pm 1$.


Question 4.2.3. (a) The vacuum $|0\rangle$ for the harmonic oscillator is the state with minimum
possible energy. Find its normalised wave-function $\psi(x, t)$. (See equations (4.2.3).)

(b) Use your answer in (a) to find the average value (expectation value) $\int \psi^* \hat{x}^4 \psi$ of the
observable $\hat{x}^4$ in the vacuum.

(c) Now do the same calculation using the Heisenberg picture (4.2.5): calculate the
expectation value $\langle 0|\hat{x}^4|0\rangle$ using creation/annihilation operators.


Question 4.2.4. (a) In nonrelativistic quantum physics, the current density is $j(x) =
\frac{\hbar}{2mi}\,(\psi^* \nabla\psi - (\nabla\psi^*)\psi)$ and the probability density is $\rho(x) = |\psi(x)|^2$. Verify that they
obey the equation of continuity $\partial\rho/\partial t + \nabla \cdot j = 0$. (The equation of continuity says that
the spatial integrals $\int \rho(x, t)\,dx$ are independent of $t$.)

(b) Suppose $\varphi$ was a wave-function obeying the Klein-Gordon equation (4.2.7). The
relativistic version of $(j, \rho)$ is $j^\mu(x) = \frac{\hbar}{2mi}\,(\varphi^* \partial^\mu \varphi - (\partial^\mu \varphi^*)\varphi)$. Verify that this obeys the
relativistic equation of continuity $\sum_\mu \partial_\mu j^\mu = 0$, but that the corresponding probability
density $j^4$ is not positive. (This is the first sickness of relativistic quantum physics based
on the Klein-Gordon equation. The reason for these negative probabilities is that $j^4$
involves a time derivative, due to the Klein-Gordon equation being second order in
time.)

(c) Verify that $\varphi(x) = \exp[-i \sum_\mu k_\mu x^\mu]$ satisfies the Klein-Gordon equation and is also
an eigenfunction of energy and momentum, provided $k$ and $m$ are related in a certain
way. Verify that negative energy solutions to the Klein-Gordon equation do exist. (This
is the second, related sickness.)


Question 4.2.5. Mathematically speaking, bounded operators are much nicer than 
unbounded ones. Explain why, physically speaking, we don’t lose any generality restrict- 
ing to bounded self-adjoint observables. 


4.3 From strings to conformal field theory 


In this section we introduce rational conformal field theory (RCFT), as it is known in 
physics. Standard references for this material are the book [131] and the review articles 
[239], [209], [224]. We also touch on one of its motivations: string theory. A more 
mathematical treatment of RCFT is provided in the following section. 

We essentially identify conformal field theory (CFT) and perturbative string theory, 
but this is an oversimplification. For instance, a string theory exists simultaneously on 
several Riemann surfaces, and the corresponding amplitudes are added together. These 
surfaces correspond to the various terms in a perturbative expansion (a Taylor series 
in the string tension parameter $T$) of the true physical amplitudes. In string theory, the
quantities for each surface are of no direct significance by themselves, any more than the
term ‘196 884q’ by itself means anything special to $SL_2(\mathbb{Z})$. In CFT, on the other hand,
the Riemann surface is fixed – for example, the theory on the torus could be realised
by a statistical mechanical model on the plane where the fields obey doubly-periodic 
boundary conditions. In fact, it is the deep connection to string theory that gave conformal 
field theorists the compulsion to explore their theories in arbitrary genus. 

Conformal field theory and string theory have impacted remarkably on mathemat- 
ics. For instance, five of the twelve Fields medals awarded in the 1990s were to men 
(Drinfel’d, Jones, Witten; Borcherds, Kontsevich) whose work directly concerned aspects 
of CFT. Probably no other structure has affected so many areas of mathematics in so 
short a time. Moonshine (and this book) have been deeply influenced by CFT. 

The impact so far on physics has been less profound. String theory is still our best 
hope for a unified theory of everything, and in particular a consistent theory of quantum 
gravity. It goes through periods of boom and periods of bust, not unlike the breathing of 
a snoring drunk, and it is still too early to draw any definite conclusions. 

However, recall Dirac’s quote in Section 1.2.2 about the deep relation between math- 
ematics and physics. For example, the inverse-square law (‘force is proportional to 
$|x - y|^{-2}$’) is so mathematically elegant that it must play a role in physics, at least in
certain limiting situations. We see it in Newton’s gravitation, and the Coulomb force 
between electric charges, and we now understand it to be the effective macroscopic 
theory associated with a massless boson in an abelian gauge theory. The same, it can be 
argued, should be true with string theory.^11


4.3.1 String theory 


The Standard Model describes the quantum theory of the electromagnetic, weak and 
strong forces. It ignores the force that to us plodding behemoths is the most blatant: 
gravity. The direct approach to quantising gravity fails: the resulting quantum field 
theory is easy to write down but it is nonrenormalisable and computationally useless. 
This strongly suggests that new physics should be entering in at high energies (= small 
distances). Indeed, naive calculations involving general relativity (which relates energy 
densities to the space-time metric) suggest that as we zoom in on space-time at distances 
of around $10^{-33}$ cm (the so-called Planck length), the virtual quantum oscillations will
change the topology of space-time. Far from being a continuum (manifold), space-time 
at small scales would seem to be some sort of quantum foam. 

Because this issue is so fundamental, there are several approaches to resolving it. One 
of these is string theory, which was created by accident in 1968, where it was applied to 
the wrong problem, and gave, it was soon realised, the wrong answers. The explosion 
of interest in it as a theory of quantum gravity, and everything else, began in 1984. 

The electron is a particle, that is, it can be localised to a point. The Standard Model, 
say, contains several other equally fundamental particles, each distinguished by different 
abstract assignments (e.g. representations) attached to that point. In string theory, the 


11 I owe this thought to Peter Goddard.


Fig. 4.11 Some two-loop Feynman diagrams of (a) particles and (b) strings. 


fundamental object is a string (i.e. a finite curve of length approximately $10^{-33}$ cm).
Depending on the particular theory, this string can be open or closed, oriented or unori- 
ented. 

There are several advantages to having extended objects. One is that the particle zoo 
is simplified, as those abstract assignments can be modelled geometrically using the 
changing shape of the string. For example, the difference between a string realising an 
electron, and a string realising a photon, is in how it oscillates. In place of the several 
dozen ‘elementary particles’ of the Standard Model, we have only one string, whose 
precise physical properties at a given time depend not only on its momentum but also its 
vibrational mode. Likewise, the possible interactions are simplified. Recall that to each 
term in a particle Lagrangian £, we have a possible vertex for the Feynman diagrams of 
perturbation theory. On the other hand the interactions of strings are purely topological: 
for example, a single string can split into two, or two join into one. Most importantly, 
a theory of quantum gravity seems to arise naturally and seems far better behaved than 
other quantum theories of gravity. 

The weary reader may wonder whether future physicists could initiate new ‘revo- 
lutions’ by replacing strings with membranes or other higher-dimensional manifolds. 
Such a reader may find some solace in the No-Go theorem described in chapter 2.1.1 
of [261]. Nevertheless, modern string theory interprets D-branes (membranes where the 
endpoints of open strings reside) as dynamical objects in their own right, correspond- 
ing to higher-energy semi-classical solutions. Just as for low-energy approximations we 
study perturbations about a vacuum, for higher-energy approximations we need to study 
perturbations about D-branes. It is hoped (though with little justification) that together 
those perturbative patches cover all of parameter space. 

The Lagrangian of a free particle says that the classical particle travels in such a way 
that its arc-length is minimised. The natural analogue for a string says that the classical 
string tries to minimise the area of the surface (‘world-sheet’) it traces out. This Nambu-Goto
action describes what we now call the bosonic string. An equivalent formulation,
called the Polyakov action, expresses it as an integral over moduli space. 

We are interested in perturbative string theory. Recall (4.2.13d). Figure 4.11 gives 
some two-loop Feynman diagrams arising in the scattering of two particles/strings. As 
usual, we take the incoming and outgoing states to be asymptotic (this simplifies things 
considerably). For simplicity, make the particle theory $\varphi^3$ (so the diagrams are trivalent)

Fig. 4.13 The punctured surface corresponding to Figure 4.11(b).


and the string closed. For the particle, both diagrams in (a) would contribute a term. For 
the string, the equality in (b) reflects the fact that in Polyakov’s formulation, conformally 
equivalent world-sheets correspond to the same term in the perturbative expansion, and 
should only be counted once. This is why the Feynman sum reduces to an integral over 
moduli space (in this case $\mathcal{M}_{2,4}$).

In any quantum field theory, each vertex $v$ contributes some operator to that perturbation
summand. To what does this correspond in (b)? We obtain our ‘vertices’ by
dissecting our world-sheet into spheres with three legs (‘pairs-of-pants’), as in Fig- 
ure 4.12. The operator in string theory is called a vertex (intertwining) operator. It 
is a local operator describing the absorption or emission of a string state by another. 
Surprisingly, these vertex operators are central to the rest of our story. 

Because we’re really interested in asymptotic $t \to \pm\infty$ initial/final states, the external
tubes of the world-sheets are semi-infinite. We can conformally shrink those tubes
into punctures (one for each incoming/outgoing string), so Figure 4.11(b) becomes Figure 4.13.
The easiest example of this map is also the most important: send a cylindrical
world-sheet, with local coordinates $-\infty < t < 0$ and $0 \le \theta < 2\pi$, to the complex plane
using $(t, \theta) \mapsto z = e^{t - i\theta}$; then the cylinder goes to the unit disc and $t = -\infty$ corresponds
to the puncture at z = 0. It thus suffices to consider world-sheets that are compact sur- 
faces, with marked points indicating the external lines. The data of those external string 
states are stored in the appropriate vertex operator attached to that point. This is one of 
the remarkable features of string theory: that space-time string amplitudes (in, for exam- 
ple, 26 dimensions) can be expressed as correlation functions (4.2.11) in a point-particle 
quantum field theory in two dimensions, where the fields are vertex operators. 
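
The conformal map from the semi-infinite cylinder to the punctured disc is easy to experiment with. The sketch below assumes the map $(t, \theta) \mapsto z = e^{t - i\theta}$ with $t < 0$ and $\theta$ taken modulo $2\pi$; the function name and the sample points are illustrative only.

```python
import cmath
import math


def cylinder_to_plane(t, theta):
    """Map a point (t, theta) of the cylinder to z = exp(t - i*theta) in the plane."""
    return cmath.exp(t - 1j * theta)


# A point on the half-cylinder t < 0 lands strictly inside the unit disc.
z = cylinder_to_plane(-3.0, 0.5)
print(abs(z) < 1.0)

# The angular coordinate is periodic, so the map is well defined on the cylinder.
z_shifted = cylinder_to_plane(-3.0, 0.5 + 2 * math.pi)
print(abs(z - z_shifted) < 1e-12)

# The far end of the tube, t -> -infinity, is squeezed into the puncture z = 0.
print(abs(cylinder_to_plane(-50.0, 1.0)))
```

Circles of constant $t$ on the cylinder become circles of radius $e^{t}$ around the origin, so time evolution along the tube becomes radial dilation in the plane.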

String theory is important to Moonshine because modular functions arise there. That 
amplitudes in string theory could be modular functions was known almost from the 
very beginning, and by 1971 we even knew the modern geometric explanation: one-loop 
vacuum-to-vacuum amplitudes in string theory are path integrals $\int Z(\text{torus})\, d[\text{torus}]$ over
conformal equivalence classes of tori; because the moduli space of tori is $\mathbb{H}/SL_2(\mathbb{Z})$
(Section 2.1.4), this makes the modularity of $Z(\tau) := Z(\mathbb{C}/(\mathbb{Z} + \tau\mathbb{Z}))$ manifest. The
meromorphicity of the amplitudes at the cusps follows from the good behaviour (‘fac- 
torisation’) of the amplitudes when the surface is deformed into one with nodes (Sec- 
tion 2.1.4). In short, modular forms and functions appear very naturally in perturbative 
string theory. Elsewhere, especially Section 7.2.4, we study why this is in more depth. 

The modularity (Theorem 3.2.3) of the affine algebra characters χ_λ arises from strings
living on the corresponding compact simply-connected Lie group G (this is the so-called 
Wess—Zumino—Witten model). Likewise, quadratic moonshine (i.e. the modularity of 
theta functions) arises from the theory of strings living on the torus R^n/L. There is also
a string theory responsible for the modularity of the j-function (0.1.8). Much of the 
remainder of this book tries to explain this. 

It is often argued that string theory makes no experimental predictions, other than 
the dimension of space, which it over-estimates by a factor of 3. This is perhaps a little 
unfair. String theory predicts a world qualitatively much like that we observe: a world with 
quantum gravity governed by Einstein’s equations at the low-energy, long-distance limit, 
and gauge groups large enough to include the Standard Model with its zoo of particles. 
String theory also seems more finite than usual quantum field theories. Unlike the 18 
adjustable parameters of the Standard Model, and the fairly arbitrary choices of gauge 
groups and particles possible in quantum field theories, there is a unique (M-)theory! 

But that too is a little dishonest. There are enormous numbers of classical solutions, 
and each of these serves as a possible vacuum to perturb about. Each choice of vacuum 
corresponds to a different effective dimension of space-time, gauge group, etc. — different 
physics. So the problem for the perturbative approach is which vacuum to choose. This 
isn’t so strange: the dynamic role of the vacuum is also important in the Standard Model, 
where the vacuum is less symmetric than the Lagrangian, and this gives rise to the 
masses of particles, etc. Also, we know that perturbation theory is only an approximation 
(probably ill-defined) to the full quantum theory, where for instance we have quantum 
tunnelling between different vacua. To really understand the effective physics and thus 
make precise experimental predictions would require a truly nonperturbative treatment 
of string theory, and this is difficult (D-branes are our most reliable probe for this). In fact, 
when we have large numbers of strongly interacting strings, the string picture probably
ceases to be a good way of capturing the physics. But these issues, though important for
physics, don’t concern Moonshine. 

Whether a believer, sceptic or agnostic, one must concede that string theory is truly 
remarkable. To Witten, physics without strings is like mathematics without complex 
numbers: just as the particle traces out a real curve (its world-line), the string traces out 
a complex curve (its world-sheet). Standard string theory books are [261], [463]. 


4.3.2 Informal conformal field theory 


A conformal field theory is a quantum field theory, usually on a two-dimensional 
space-time, whose symmetries include the conformal transformations. The first
two-dimensional CFT (the c = 1/2 free fermion) was constructed by Thirring in 1953.
CFT really took off in the 1980s, starting with [50]. It arises in string theory, as well as 
the statistical mechanics describing certain phase transitions. Higher-dimensional CFT 
appears in the so-called AdS/CFT correspondence (see e.g. [5]). 

The relation between CFT and string theory is that CFT lives on the world-sheet Σ
traced by the strings as they evolve (colliding and separating) through time. Of course, 
a quantum string only collides and separates in the virtual sense of a Feynman diagram, 
and so CFT arises in perturbative string theory. More precisely, each term in the Feynman 
perturbation expansion of S-matrix entries in closed string theory will be a correlation 
function in a CFT living on the world-sheet. The world-sheets of these scattering strings 
have a boundary component for every incoming and outgoing string, as in Figure 4.11(b). 
Any such surface is conformally equivalent to a compact Riemann surface Σ with marked
points p_1, ..., p_n (one for every incoming and outgoing string), as in Figure 4.13. For
reasons we will explain shortly, we also require a choice of local coordinate z_i for each
p_i, that is, an explicit identification of a neighbourhood of p_i ∈ Σ with one of 0 ∈ C, so
that z_i = 0 is the coordinate for p_i. We discuss the moduli space M̂_{g,n} of these ‘enhanced
surfaces’ in Section 2.1.4.

This space-time Σ can be any conformal surface, and we identify conformally equiv-
alent Σ. We restrict to compact orientable Σ, although we don’t fix an orientation on
it. Because of the string theory interpretation, it is tempting to give each
such Σ a Lorentzian metric (i.e. locally dt^2 − dx^2), but this is incorrect: for compact Σ such a metric
exists only for the torus. Instead, we give each Σ the usual Euclidean signature (i.e.
locally dx^2 + dy^2 = dz dz̄) of Riemann surfaces. We think of the same CFT as living
simultaneously on all such Σ. This leads inevitably to a moduli space formulation.

The simplest indication why two dimensions are so special for CFT is that the space 
of local conformal transformations, which forms a Lie algebra isomorphic to so_{n+1,1}(R)
in R^n for n > 2, becomes infinite-dimensional in two dimensions. More precisely, if
f(z) is any holomorphic map with nonzero derivative f′(z_0) at some point z_0 ∈ C, then
f is conformal in a neighbourhood of z_0 (the converse is also true; see, for example,
theorem 14.2 in [481]). Similarly, anti-holomorphic maps preserve the absolute value
of angles but reverse the sign. This is essentially the statement that the Lie algebra of
conformal Killing vector fields in R^n is infinite-dimensional iff n = 2 (see chapter 1
of [495] for a definition and proof); when n = 2 it contains two commuting copies of 
the Witt algebra Witt (1.4.9) (one copy for the holomorphic maps and one for the anti- 
holomorphic ones), arising as dense polynomial subalgebras in this conformal algebra. 
In our approach, this is how the Virasoro algebra arises. As mentioned in Section 3.1.2, 
n copies of Witt act on the enhanced moduli space M̂_{g,n}, either by changing the local
coordinate z_i, moving the insertion point p_i or changing the complex structure of Σ.
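
The Witt relations can be checked mechanically on the polynomial vector fields themselves. The sketch below assumes the common normalisation ℓ_n = −z^{n+1} d/dz (the sign convention of (1.4.9) is an assumption here) and verifies [ℓ_m, ℓ_n] = (m − n) ℓ_{m+n}; the helper names are ours.

```python
from fractions import Fraction

def ell(n):
    """Witt generator l_n = -z^(n+1) d/dz, stored as {power of z: coefficient}."""
    return {n + 1: Fraction(-1)}

def derivative(f):
    """Derivative of a Laurent polynomial stored as {power: coefficient}."""
    return {p - 1: c * p for p, c in f.items() if p != 0}

def multiply(f, g):
    out = {}
    for p, a in f.items():
        for q, b in g.items():
            out[p + q] = out.get(p + q, 0) + a * b
    return {p: c for p, c in out.items() if c != 0}

def bracket(f, g):
    """Lie bracket of the vector fields f d/dz and g d/dz: (f g' - g f') d/dz."""
    out = dict(multiply(f, derivative(g)))
    for p, c in multiply(g, derivative(f)).items():
        out[p] = out.get(p, 0) - c
    return {p: c for p, c in out.items() if c != 0}

def scale(f, k):
    return {p: c * k for p, c in f.items() if c * k != 0}

def witt_relation_holds(m, n):
    """Check [l_m, l_n] = (m - n) l_(m+n) for one pair of indices."""
    return bracket(ell(m), ell(n)) == scale(ell(m + n), m - n)
```

Calling `witt_relation_holds(m, n)` over a range of integers m, n exercises the relation; the anti-holomorphic copy works identically with z replaced by z̄.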

The CFT literature is very sloppy when discussing the conformal group in two dimen- 
sions. In spite of numerous published claims to the contrary, it is not the conformal group 
of R^2 versus that of R^n (n > 2) that singles out two dimensions. The conformal group
is isomorphic to the finite-dimensional SO_{n+1,1}(R) in any R^n. Although we can identify
R^2 with C, and although holomorphic functions f are locally conformal (provided we
avoid the zeros of f′), these f don’t form a group. Although the conformal group of
R^2 = C (or its compactification S^2, if we permit poles) is finite-dimensional, the con-
formal group of ‘Minkowski space’ R^{1,1} (or better, its compactification S^1 × S^1, with one
S^1 for each null-direction x^1 ± x^2) is infinite-dimensional, and for S^1 × S^1 consists of
two copies of Diff^+(S^1) × Diff^+(S^1), where Diff^+(S^1) is the group of orientation-preserving
diffeomorphisms of the circle (Section 3.1.2). Thus its Lie algebra is Witt ⊕ Witt. If one wants an
infinite-dimensional conformal group in CFT, one must put a Minkowski metric on the 
cylinder or plane. 

The subtle and poorly understood role of two dimensions for the conformal group is 
carefully discussed in [495]. Also interesting is how it arises in Segal’s picture (Sec- 
tion 4.4.1). For the interplay and representation theories of Witt, its central extension 
Vir and the real Lie group Diff^+(S^1), see Section 3.1.2.

On the cylindrical world-sheet in string theory, given a Minkowski metric, the standard
light-cone coordinates would be t ± x, where t is time and x is a periodic angle parameter.
The solutions to the classical equations of motion on the cylinder would be functions
of t ± x (i.e. left- and right-moving disturbances travelling at the speed of light). As
always, the Hamiltonian is proportional to the generator ∂/∂t of time translations. The
Euclidean version (which is what we use) is w, w̄ = t ∓ ix, and so the left- and right-
movers become holomorphic/anti-holomorphic functions of the cylindrical coordinate
w = t − ix. As is traditional but slightly disturbing, w and w̄ are usually to be treated as
independent complex variables; we will return to this subtle point shortly. By a formal
application of the chain rule, the Hamiltonian in the Euclidean picture will be


$$\frac{\partial}{\partial t} = \frac{\partial w}{\partial t}\frac{\partial}{\partial w} + \frac{\partial \bar w}{\partial t}\frac{\partial}{\partial \bar w} = \frac{\partial}{\partial w} + \frac{\partial}{\partial \bar w}.$$

In CFT, we prefer to use compact surfaces with marked points, so we should confor-
mally map the semi-infinite tubes of the world-sheets to punctures on a compact surface.
Locally, such a map looks like z = exp(w). This conformally maps our Euclidean cylin-
der to the punctured plane C \ 0. Likewise, z̄ = exp(w̄) becomes the right-moving
coordinate. We can now write the Hamiltonian using the Witt generators ℓ_n = −z^{n+1} ∂_z:

$$\frac{\partial}{\partial t} = z\frac{\partial}{\partial z} + \bar z\frac{\partial}{\partial \bar z} = -\ell_0 - \bar\ell_0.$$

Basic data in the CFT are the quantum fields φ(z, z̄), the vertex operators of the last
subsection, centred at z = z̄ = 0 on the Riemann sphere Σ = P^1(C). The notation
φ(z, z̄) emphasises that these fields may depend neither holomorphically nor anti-
holomorphically on z. These φ are ‘operator-valued distributions’ on Σ, acting on the
space H of states for the punctured plane (i.e. corresponding to a propagating string);
as usual in quantum field theory, they create the various states by acting on the vacuum
|0⟩ ∈ H. As usual, H comes with a Hermitian product, which allows us to compare
|in⟩ with |out⟩; in a physical theory it should be positive-definite (a theory without
this positive-definiteness is called non-unitary). When we say φ(z, z̄) is ‘centred at 0’,
we mean that the matrix entry ⟨u, φ(z, z̄)v⟩ will be a Laurent polynomial in the local
coordinates z and z̄, for any u, v ∈ H, with a singularity only at 0 (unless the outgoing
state u isn’t the vacuum, in which case infinity can also be singular).

In a CFT, anything that looks like a quantum field is called a quantum field. In the 
quantum field theories of Section 4.2, only the finitely many generating fields (e.g. the 
ones appearing in the Lagrangian) are usually called quantum fields. 

Any quantum field theory has a state-field correspondence: to a field φ is associated
its incoming state, that is the t → −∞ limit of φ|0⟩. Typically, different fields can
correspond to the same state. In CFT though, this correspondence becomes a bijection:
to a given field φ(z, z̄) on P^1(C), we associate the state φ(0, 0)|0⟩ = v ∈ H (recall that
z = e^{t−ix}). Let φ_v denote the unique field corresponding to state v.

As for any quantum field theory, solving a CFT requires calculating all n-point corre- 
lation functions (4.2.11): 


$$\langle \varphi_{v_1}(z_1,\bar z_1)\,\varphi_{v_2}(z_2,\bar z_2)\cdots\varphi_{v_n}(z_n,\bar z_n)\rangle_{(\Sigma;\,p_1,\ldots,p_n)}, \qquad (4.3.1a)$$


for any choice of enhanced surface (Σ, p_i, z_i) and states v_i ∈ H. We think of φ_{v_i}(z_i, z̄_i)
as being centred at p_i; the local coordinates z_i, z̄_i describe it as an ‘operator-valued
distribution’ on Σ about p_i. Simplest is the sphere Σ = P^1(C), because then we can fix
a global variable w, and choose z_i = w − p_i. In this case the time-ordering of (4.2.11),
necessary for convergence, becomes the radial-ordering


$$|p_1| < |p_2| < \cdots < |p_n|, \qquad (4.3.1b)$$


because of our map e^{t−ix}. The interpretation of n-point functions for other surfaces is
more subtle and will be discussed shortly. 

The partition functions Z_Σ (4.2.13b) correspond to vacuum-to-vacuum string ampli-
tudes, and are functions on the moduli space of Σ. For example, a sphere is the world-
sheet traced by a closed string spontaneously created from and then reabsorbed into the
vacuum. As usual in quantum field theory, we can organise these amplitudes by how 
many internal ‘loops’ are involved (i.e. the genus of the surface): topologically, 0-loop 
(i.e. ‘tree-level’) world-sheets are spheres, 1-loop world-sheets are tori, etc. The 0-loop 
contribution isn’t very interesting (all spheres are conformally equivalent), but we’ll see 
shortly that the 1-loop partition function contains considerable information. 

Next we describe two general tools introduced by Kenneth Wilson in the 1960s (see 
e.g. [558]). The first is the operator product expansion (OPE). The idea is to replace the 
ill-defined product φ_1(x)φ_2(x) of quantum fields by


$$\varphi_1(x)\,\varphi_2(x') = \sum_{n=0}^{\infty} C_n(x - x')\,O_n(x), \qquad (4.3.2)$$

so the singularity structure as x′ → x becomes manifest. The singular terms of (4.3.2)
are physically the relevant ones. Here, the O_n are fields in the theory, and are express-
ible as polynomials in the fields φ_i and their various derivatives. The coefficients C_n
are complex-valued functions with singularities of the form |x|^{−p} (for p > 0) or log|x|,
with the more singular coefficients C_n corresponding to simpler fields O_n. Equation
(4.3.2) is meant to hold for x′ close to x, in the weak sense of matrix entries, that is
correlation functions (4.3.1a). The significance of (4.3.2) to (4.3.1a) should be clear. A
derivation and clarification of this fundamental concept (4.3.2) is made in (5.1.6), in the
context of vertex operator algebras. The scalar quantum field theory in four dimensions,
with φ^4 interaction term, is worked out in detail in section 13-5-1 of [310], where we
find for example that the only singular coefficient in the OPE of φ(x)φ(y) is propor-
tional to log(x^2). The reader may find helpful the discussion of OPE given in lecture 3
of [567]. 

The OPE can be made more explicit here because CFT (unlike most theories) is 
scale-invariant, and this is Wilson’s second tool. We apply it separately to z and z̄. Scale-
invariance means we have a unitary representation s ↦ U(s) of the multiplicative group
R^×_{>0} of positive real numbers, which is a symmetry of the Lagrangian; an eigenfield
φ transforms by U(s)^{−1} φ(z, z̄) U(s) = s^h φ(sz, z̄) for some real number h (the ‘scal-
ing dimension’ or conformal weight of φ). Similarly, scaling z̄ yields an independent
conformal weight h̄. Scale-invariance requires that the coefficient C_n in (4.3.2) scales
like


$$C_n(sz, \bar s\bar z) = s^{\,h(n)-h_1-h_2}\;\bar s^{\,\bar h(n)-\bar h_1-\bar h_2}\;C_n(z,\bar z),$$


where h(n), h̄(n) are the conformal weights of O_n, h_i, h̄_i are those of φ_i, and s̄ is the
independent scale factor applied to z̄. Since


$$U(s)^{-1}\,\partial_z\varphi(z,\bar z)\,U(s) = \frac{\partial}{\partial z}\bigl(s^{h}\varphi(sz,\bar z)\bigr) = s^{h}\,s\,\frac{\partial\varphi}{\partial(sz)}(sz,\bar z) = s^{h+1}(\partial_z\varphi)(sz,\bar z),$$

the field ∂_z φ has conformal weight h + 1. Thus the possible conformal weights of the
fields O_n lie in N h_1 + N h_2. This means that (4.3.2) involves only finitely many singular
coefficients C_n. We see this more explicitly in (5.1.6).

Recall that, classically, a continuous symmetry implies by Noether’s Theorem the exis-
tence of a conserved current and conserved charges. In the case of the conformal symme-
try of CFT, the conserved current is the stress-energy tensor, which has nonzero compo-
nents T(z) := T_{zz}(z) and T̄(z̄) := T_{z̄z̄}(z̄). The conserved charges L_n := (1/2πi) ∮ T(z) z^{n+1} dz
satisfy


$$T(z) = \sum_{n\in\mathbb{Z}} L_n\,z^{-n-2} \qquad (4.3.3)$$


(and similarly for L̄_n). In a quantum field theory, these arise in the Ward identities (4.2.12).
Here these say, roughly, that taking a derivative of a correlation function ⟨⋯⟩_Σ with
respect to a component of the metric on Σ is equivalent to inserting some component of
T(z) into that correlation function. The OPE of the field T(z) with itself can be computed:


$$T(z)\,T(z') = \frac{c}{2}\,(z-z')^{-4} + 2\,(z-z')^{-2}\,T(z') + (z-z')^{-1}\,\partial_{z'}T(z') + \cdots, \qquad (4.3.4)$$


where we display only the singular terms. The number c is called the (holomorphic)
central charge of the CFT. From this we obtain (see (5.1.6c)) the commutation relations
for the modes L_n, and we recover (3.1.5a). In other words, the modes L_n define a
representation of the Virasoro algebra on H. Likewise, the modes L̄_n also define a
representation of the Virasoro algebra (say with central charge c̄). These two copies of
Vir commute: [L_n, L̄_m] = 0. From the Hermitian product we get that c, c̄ and all the
conformal weights h are nonnegative real numbers.
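
The central term that the OPE (4.3.4) adds to the Witt relations is tightly constrained by the Jacobi identity. As a minimal sketch, assume (3.1.5a) is the standard normalisation [L_m, L_n] = (m − n) L_{m+n} + δ_{m+n,0} (m^3 − m) C/12 with C central (the function names and the dictionary encoding are ours); the code below checks the Jacobi identity on basis elements.

```python
from fractions import Fraction

def bracket_basis(m, n):
    """[L_m, L_n] in the assumed standard normalisation, as {basis label: coefficient}."""
    out = {}
    if m != n:
        out[("L", m + n)] = Fraction(m - n)
    if m + n == 0 and m**3 != m:
        out["C"] = Fraction(m**3 - m, 12)
    return out

def bracket(x, y):
    """Bilinear extension; the central element C commutes with everything."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            if a == "C" or b == "C":
                continue
            for key, coeff in bracket_basis(a[1], b[1]).items():
                out[key] = out.get(key, 0) + ca * cb * coeff
    return {k: v for k, v in out.items() if v != 0}

def add(*terms):
    out = {}
    for term in terms:
        for k, v in term.items():
            out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v != 0}

def L(n):
    return {("L", n): Fraction(1)}

def jacobi_holds(a, b, c):
    """Check [L_a,[L_b,L_c]] + [L_b,[L_c,L_a]] + [L_c,[L_a,L_b]] = 0."""
    total = add(
        bracket(L(a), bracket(L(b), L(c))),
        bracket(L(b), bracket(L(c), L(a))),
        bracket(L(c), bracket(L(a), L(b))),
    )
    return total == {}
```

Replacing m^3 − m by, say, m^3 alone makes `jacobi_holds` fail for suitable indices, which is one way to see how rigid the central extension is.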

Thus, just as the usual quantum field theories (e.g. the Standard Model) carry projec- 
tive representations of the Poincaré algebra, a CFT carries a projective representation of 
its conformal algebra, that is, of two commuting copies of the Witt algebra. Hence we 
get the true representation of Vir ⊕ Vir on H defined above. A nonzero central charge
c (which is typical) amounts physically to a soft breaking of the conformal symmetry,
an anomaly, caused by considering CFT on a surface with curvature. More precisely,
the correlation functions (4.3.1a) of a CFT will always be invariant under complex
diffeomorphisms of the surface Σ, but in genus > 1 when c ≠ 0 the correlation func-
tions change under local rescalings of the metric. The central charge can be interpreted
physically [3] as a Casimir (vacuum) energy, something which depends on space-time 
topology. 

As we have seen, everything in CFT comes in a combination of strictly holomorphic 
(left-moving) and strictly anti-holomorphic (right-moving) quantities. Here, ‘holomor-
phic’ is in terms of the two-dimensional space-time Σ (which locally looks like C), or
the local parameters on the appropriate moduli space (which usually locally looks like 
C). These holomorphic and anti-holomorphic building blocks are called chiral. A CFT 
is studied by first analysing its chiral parts, and then determining explicitly how they 
piece together to form the physical quantities. For the applications of CFT to Moonshine, 
the chiral parts and not the full CFT are what’s important. More generally, almost all 
attention in CFT by mathematicians has focused on the chiral data. 

Let 𝒱 consist of all the holomorphic fields φ(z), and 𝒱̄ the anti-holomorphic ones. For
example, 𝒱 contains T(z). Both 𝒱 and 𝒱̄ are closed under the OPE (4.3.2), and so form
algebras called the chiral algebras of the theory. In the next chapter these algebras are
axiomatised. 𝒱 and 𝒱̄ mutually commute and the symmetry algebra of the CFT is often
identified with 𝒱 ⊕ 𝒱̄. However, the vacuum is not invariant under most of 𝒱 ⊕ 𝒱̄; we
say this symmetry is ‘spontaneously broken’. Under the state-field correspondence, 𝒱
and 𝒱̄ correspond to subspaces V and V̄ of the state space H. We call the quantum fields
φ(z) ∈ 𝒱 (chiral) vertex operators.

Since L_0 acts like −z∂_z, the scaling operator U(s) defined earlier is s^{−L_0}. The Virasoro
operators L_0, L_{±1} are special in that they generate the three-dimensional conformal group
SL_2(C) of the (Riemann) sphere. We have


$$s^{L_0}\,\varphi_v(z)\,s^{-L_0} = s^{h}\,\varphi_v(sz), \qquad (4.3.5a)$$

$$e^{xL_{-1}}\,\varphi_v(z)\,e^{-xL_{-1}} = \varphi_v(z+x), \qquad (4.3.5b)$$

$$e^{xL_{1}}\,\varphi_v(z)\,e^{-xL_{1}} = (1-xz)^{-2h}\,\varphi_v\!\left(\frac{z}{1-xz}\right), \qquad (4.3.5c)$$


for any v ∈ V, provided L_0 v = hv (we say v has conformal weight h) and L_1 v = 0.
Such states v are called conformal quasi-primaries. If in addition v satisfies L_n v = 0
for all n > 0, then v is called a conformal primary state. They are precisely the lowest-
weight states (Section 3.1.2) for the irreducible Vir-submodules of state-space H; H
will be the direct integral (Section 1.3.1) over all conformal primaries of the associated
lowest-weight Vir-modules. Equations (4.3.5) are generalised in (5.3.15).
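
All three formulas in (4.3.5) share one shape: a Möbius map f acts on a weight-h quasi-primary by φ_v(z) ↦ f′(z)^h φ_v(f(z)), with f(z) = sz, z + x and z/(1 − xz) respectively. The following is a small numerical sketch of that observation (the function names and sample values are our own choices), restricted to real arguments so that the fractional powers are unambiguous.

```python
def scaling(s):
    """z -> s*z together with its derivative, as in (4.3.5a)."""
    return (lambda z: s * z), (lambda z: s)

def translation(x):
    """z -> z + x together with its derivative, as in (4.3.5b)."""
    return (lambda z: z + x), (lambda z: 1.0)

def special(x):
    """z -> z/(1 - x*z) together with its derivative, as in (4.3.5c)."""
    return (lambda z: z / (1 - x * z)), (lambda z: (1 - x * z) ** -2)

def prefactor(derivative, z, h):
    """The factor f'(z)^h multiplying phi_v(f(z)) for a weight-h quasi-primary."""
    return derivative(z) ** h

def numerical_derivative(f, z, eps=1e-6):
    """Central finite difference, used only to sanity-check the stated derivatives."""
    return (f(z + eps) - f(z - eps)) / (2 * eps)
```

For example, `prefactor(special(0.4)[1], 0.3, 0.5)` reproduces (1 − xz)^{−2h} at x = 0.4, z = 0.3, h = 1/2, and composing `special(0.2)` with `special(0.3)` gives `special(0.5)`, reflecting that e^{xL_1} is a one-parameter group.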

More generally, the state-space H carries a representation of the symmetry algebra 𝒱 ⊕
𝒱̄, and decomposes into a direct integral of irreducible 𝒱 ⊕ 𝒱̄-modules (proposition 3.1 of
[187]). A rational conformal field theory (RCFT) is one whose state-space H decomposes
into a finite sum


$$H = \bigoplus M \otimes N, \qquad (4.3.6a)$$


where M and N are irreducible modules of the chiral algebras 𝒱 and 𝒱̄, respectively. One
of the summands in (4.3.6a) is V ⊗ V̄. The rational ones are the CFTs we are interested
in; the name ‘rational’ was chosen because for them the central charge c and all conformal
weights h are rational numbers. The chiral algebras of an RCFT will have only finitely
many irreducible modules M; for later convenience let Φ = Φ(𝒱) denote the set of these.
The M ∈ Φ are called chiral primaries even though they don’t necessarily correspond
to a unique vector in H. It is more convenient to write (4.3.6a) in the equivalent form


$$H = \bigoplus_{M\in\Phi(\mathcal V),\;N\in\Phi(\bar{\mathcal V})} Z_{M,N}\,M\otimes N, \qquad (4.3.6b)$$


where Z_{M,N} are multiplicities (many of which may be 0). It turns out (because 𝒱 is max-
imal) that Z will be a permutation matrix. This decomposition (4.3.6b) is reminiscent of
the decomposition of a group algebra into irreducible modules. A beautiful interpretation 
in terms of Frobenius algebras in category theory is given in [211]. 

An important class of RCFT are the Wess—Zumino—Witten (WZW) models. These 
correspond to strings living on a compact Lie group G. Their mathematics is especially 
pretty, and any natural question seems to have an elegant Lie-theoretic answer. The chiral 
algebra 𝒱 is closely related to the affine Kac-Moody algebra g^{(1)} associated with G
(Section 5.2.2); its modules M ∈ Φ can be identified with the integrable highest-weight
modules L(λ) at a level k determined by c and (3.2.9c).

As with everything else in CFT, the correlation functions (4.3.1a) can be expressed in 
terms of purely chiral quantities called conformal or chiral blocks 


$$\mathcal F = \langle \mathcal I_1(v_1,z_1)\,\mathcal I_2(v_2,z_2)\cdots\mathcal I_n(v_n,z_n)\rangle_{(\Sigma;\,p_1,\ldots,p_n;\,M^1,\ldots,M^n)}. \qquad (4.3.7)$$


Once again, Σ is a compact Riemann surface with marked points p_i; to each point p_i we
assign a local coordinate z_i as before, and also a choice of irreducible module M^i ∈ Φ.
The state v_i is taken from M^i, and the fields ℐ_i(v_i, z_i), centred at p_i, are called intertwining
operators and generalise the vertex operators φ_v ∈ 𝒱. See Definition 6.1.9 (roughly, each
ℐ_i(v_i, z_i) is an operator-valued distribution sending vectors in some module to another).
In the case of higher genus Σ, (4.3.7) cannot be taken too literally, and the study of
higher-genus chiral blocks is more difficult [573], [296]; roughly, the points p_i are first
taken in the same coordinate patch of Σ; the function is then extended holomorphically.
It will need branch-cuts in Σ to be well-defined.
To solve a given RCFT, it suffices to: 


(a) construct all possible chiral blocks (4.3.7); and 
(b) reconstruct the correlation functions (4.3.1a) from those chiral blocks.

In its broad strokes, part (a) was explained in work of Moore-Seiberg [436] (and more
carefully in [32]); see Section 6.1.4. In deep work, Huang is pursuing the explicit
solution to (a) for all sufficiently nice chiral algebras 𝒱 (see e.g. [295] for the genus-0
story and [296] for genus-1). Likewise, in a series of papers written by Fuchs, Schweigert 
and collaborators, topological field theories (Section 4.4.3) are used to find a solution to 
(b) (see the reviews [211], [496]). 

In CFT the Ward identities (4.2.12) are especially useful, since the symmetries are 
so considerable. For example, they imply that it suffices to evaluate the chiral blocks 
(4.3.7) when all v_i are conformal primaries. Recall that Witt acts on moduli spaces (Sec-
tion 3.1.2); this lifts to one of Vir on chiral blocks, and the resulting partial differential
equations are the KZ equations of Section 3.2.4. Their monodromy is what makes the 
chiral blocks so interesting, especially to Moonshine. 

The most important example of chiral block is for the torus C/(Z + τZ) with one
marked point (it doesn’t matter where), assigned 𝒱-module M^1 = V and state v_1 = |0⟩.
Taking any operator ℐ_1 intertwining some M ∈ Φ(𝒱) with itself, the corresponding chiral
block (up to a constant multiple) will be the graded dimension


$$\chi_M(\tau) = \mathrm{tr}_M\, e^{2\pi i\tau(L_0 - c/24)}, \qquad (4.3.8a)$$


where c is the central charge and L_0 is the Virasoro generator corresponding to energy.
We explain in Section 5.3.4 how this arises. Using (4.3.6), the 0-point correlation function
for the torus, that is the 1-loop partition function 𝒵, becomes


$$\mathcal Z(\tau,\bar\tau) := \mathrm{tr}_H\, e^{2\pi i[\tau(L_0 - c/24) - \bar\tau(\bar L_0 - \bar c/24)]} = \sum_{M\in\Phi(\mathcal V),\,N\in\Phi(\bar{\mathcal V})} Z_{M,N}\,\chi_M(\tau)\,\overline{\chi_N(\tau)}. \qquad (4.3.8b)$$


This is a very typical decomposition of a physical correlation function into chiral blocks. 

The reviews [496], [216] provide careful explanations of why sometimes we treat z 
and z̄ as independent, and other times we must treat one as the complex conjugate of the
other. In short, from the point of view of chiral data, the single space-time Σ of the full
CFT is really two disjoint copies with opposite orientation (the Schottky double). For
example, the torus with modular parameter τ ∈ ℍ is paired with the one with parameter
−τ̄ ∈ ℍ. As in (4.3.8b), the correlation functions of the full CFT involve both modular
parameters, but at the chiral level the two tori don’t see each other. 

In particular, for a given choice (Σ; {p_i}; {M^i}), an RCFT assigns a finite-dimensional
space 𝔅^{(g,n)}_{M^1,…,M^n} of chiral blocks. Each chiral block depends multi-linearly on the v_i ∈ M^i
and meromorphically on the z_i, though branch-cuts in Σ between p_i will be needed. The
dimension of this space 𝔅^{(g,n)}_{M^1,…,M^n} is called the Verlinde dimension, and is given by
Verlinde’s formula (6.1.2) below.

For example, consider a WZW model associated with an affine algebra g = ḡ^{(1)} and
level k ∈ N. Fix an extended surface (Σ, p_i, z_i). We have a copy of g at each p_i, built
in the usual way (Section 3.2.2) from the loop algebra ḡ ⊗ C[t^{±1}]. The chiral primaries
M ∈ Φ are the integrable highest weights λ ∈ P_+^k(g); to each point p_i choose some
λ^i ∈ P_+^k(g). The associated space 𝔅 of chiral blocks is constructed in [530], and these
have an important geometric interpretation as spaces of generalised theta functions (see
chapter 10 of [495]).

The affine algebra characters χ_λ of (3.2.9a), as well as the j-function (0.1.8) are
examples of chiral blocks. As we see next subsection, the spaces 𝔅^{(g,n)}_{M^1,…,M^n} naturally
carry a representation of the mapping class group Γ_{g,n}, and this is the source of the
relation of the braid group to subfactors, as well as the modularity of Moonshine. In
particular, the RCFT characters (4.3.8a) transform nicely under SL_2(Z): for example,


$$\chi_M(-1/\tau) = \sum_{N\in\Phi} S_{MN}\,\chi_N(\tau), \qquad (4.3.9a)$$

$$\chi_M(\tau+1) = \sum_{N\in\Phi} T_{MN}\,\chi_N(\tau), \qquad (4.3.9b)$$


where S, T are finite complex matrices. This T matrix is given by

$$T_{MN} = e^{2\pi i(h_M - c/24)}\,\delta_{MN}, \qquad (4.3.10)$$


where h_M is a real number (called the conformal weight) associated with the chiral pri-
mary M ∈ Φ. The matrix S is, however, more complicated (Section 6.1.2). For example,
the matrix T for the WZW models involves the quadratic Casimir of g, while the matrix 
S involves characters of G evaluated at elements of finite order. 

The simplest class of RCFT are the minimal models, which have the smallest possible 
chiral algebra (generated only by the identity field and the stress—energy field T (z)) and 
nevertheless still have a finite decomposition (4.3.6a). They are well understood (see e.g. 
[131]). They are the RCFT with central charge 0 < c < 1, and correspond to the discrete
series (3.1.6) of Vir. 

The smallest nontrivial minimal model is the Ising model. It has central charge c = 0.5. 
The associated chiral algebra has three irreducible modules, which we label Φ = {0, ε, σ}
as in [131]. Their graded dimensions (4.3.8a) are


$$\chi_0(\tau) = q^{-1/48}\,(1 + q^2 + q^3 + 2q^4 + 2q^5 + 3q^6 + 3q^7 + \cdots), \qquad (4.3.11a)$$

$$\chi_\varepsilon(\tau) = q^{23/48}\,(1 + q + q^2 + q^3 + 2q^4 + 2q^5 + 3q^6 + 4q^7 + \cdots), \qquad (4.3.11b)$$

$$\chi_\sigma(\tau) = q^{1/24}\,(1 + q + q^2 + 2q^3 + 2q^4 + 3q^5 + 4q^6 + 5q^7 + \cdots), \qquad (4.3.11c)$$


where as always q = e^{2πiτ}. From this we can read off the conformal weights h_0 =
0, h_ε = 1/2, h_σ = 1/16, and hence the T matrix of (4.3.9b):


$$T = \begin{pmatrix} e^{-\pi i/24} & 0 & 0\\ 0 & e^{23\pi i/24} & 0\\ 0 & 0 & e^{\pi i/12}\end{pmatrix}. \qquad (4.3.11d)$$


The matrix S is more difficult to find, but it equals


$$S = \frac{1}{2}\begin{pmatrix} 1 & 1 & \sqrt2\\ 1 & 1 & -\sqrt2\\ \sqrt2 & -\sqrt2 & 0\end{pmatrix}. \qquad (4.3.11e)$$


Fig. 4.14 The moduli space of conformal field theories with central charge c = 1.


The 1-loop partition function 𝒵(τ) of (4.3.8b) is

$$\mathcal Z(\tau) = |\chi_0(\tau)|^2 + |\chi_\varepsilon(\tau)|^2 + |\chi_\sigma(\tau)|^2.$$
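
The Ising data can be tested numerically. The sketch below is not taken from the text: it assumes the standard free-fermion product formulas χ_0 + χ_ε = q^{−1/48} ∏_{n≥1} (1 + q^{n−1/2}) and χ_σ = q^{1/24} ∏_{n≥1} (1 + q^n), truncates the q-series at an arbitrary order, and then compares both sides of the S- and T-transformations and of the partition function at a sample point τ.

```python
import cmath
import math

ORDER = 40  # truncation order in q; an arbitrary choice for this sketch

def product_coeffs(exponents, limit):
    """Coefficients of prod_e (1 + x^e), e in exponents, up to x^limit."""
    coeffs = [0] * (limit + 1)
    coeffs[0] = 1
    for e in exponents:
        for k in range(limit, e - 1, -1):
            coeffs[k] += coeffs[k - e]
    return coeffs

# In the variable x = q^(1/2) the assumed product for chi_0 + chi_eps is prod (1 + x^(2n-1)).
half = product_coeffs(range(1, 2 * ORDER + 2, 2), 2 * ORDER + 1)
CHI_0 = half[0::2]       # coefficients of integer powers of q
CHI_EPS = half[1::2]     # coefficients of q^(j + 1/2)
CHI_SIGMA = product_coeffs(range(1, ORDER + 1), ORDER)

def characters(tau):
    """Truncated (chi_0, chi_eps, chi_sigma) at tau in the upper half-plane."""
    q = cmath.exp(2j * math.pi * tau)
    def series(coeffs, power):
        prefactor = cmath.exp(2j * math.pi * tau * power)  # q^power
        return prefactor * sum(a * q**n for n, a in enumerate(coeffs))
    return [series(CHI_0, -1 / 48), series(CHI_EPS, 23 / 48), series(CHI_SIGMA, 1 / 24)]

R2 = math.sqrt(2)
S = [[0.5, 0.5, R2 / 2], [0.5, 0.5, -R2 / 2], [R2 / 2, -R2 / 2, 0.0]]
CENTRAL_CHARGE, WEIGHTS = 0.5, [0.0, 0.5, 1 / 16]
T_DIAGONAL = [cmath.exp(2j * math.pi * (h - CENTRAL_CHARGE / 24)) for h in WEIGHTS]

def apply(matrix, vector):
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

def partition_function(tau):
    """Diagonal combination |chi_0|^2 + |chi_eps|^2 + |chi_sigma|^2."""
    return sum(abs(chi) ** 2 for chi in characters(tau))
```

Comparing `characters(-1 / tau)` with `apply(S, characters(tau))` tests (4.3.9a), and `partition_function` should take the same value at τ, τ + 1 and −1/τ. The last fact needs no series at all: S and T are unitary, so they preserve the sum of the squared absolute values of the characters.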


The CFT corresponding to open string perturbation — boundary CFT — is also interest- 
ing (see e.g. the review [461]). In this direction, see the proposals in [215], [458] (building 
on the α-induction of subfactors [65]). For instance, the 1-loop partition function cor-
responds to a Frobenius algebra (Section 4.4.3) in the modular category of modules of
the associated chiral algebra, and the boundary CFT data arise as a ‘category module’. 
However, boundary CFT isn’t so relevant for Moonshine and will mostly be ignored in 
this book. 

The space of CFTs can be probed using ‘marginal operators’: fields φ_v with con-
formal weight (h, h̄) = (1, 1) obeying certain other properties (see e.g. [137] and [246]
section 8.6). A given CFT can be deformed (changing its spectrum but not central charge 
c), provided it contains such a field. If the given CFT has n marginal operators, then the 
space of CFTs in its neighbourhood is expected (typically) to look like an n-dimensional 
real manifold. When the given CFT has more marginal operators than the neighbouring 
ones, the space of CFTs at that point may look like two manifolds intersecting trans- 
versely, or it can mean an orbifold singularity where you get different realisations for the 
same CFTs. The RCFTs are special points in this space. The space of known c = 1 CFTs 
is drawn in Figure 4.14. Points on the horizontal and vertical lines are parametrised by 
a radius 0 < r_orb, r_c < ∞; these two half-lines intersect at r_orb = 1/√2, r_c = √2.
The known rational c = 1 CFTs consist of the three isolated theories T(etrahedral),
O(ctahedral) and I(cosahedral), together with those theories with r_orb^2 ∈ Q or r_c^2 ∈ Q.
The fourth isolated point, RW, is irrational and described in [483]. Theories with radii 
r_c and r_c′ = 1/(2r_c) are equivalent, as are those with radii r_orb and r_orb′ = 1/(2r_orb) (this
is an example of ‘T-duality’, and arises from the extra marginal operator possessed by
the r_c = 1/√2 and r_orb = 1/√2 theories). The intersection point also has two, while the
isolated points have no marginal operators, and the remainder have one (which permits r 
to be continuously varied). The moduli space for CFT with central charge c < 1 consists 
of countably many isolated points [91]. Very little is known about the moduli space for
c > 1.


4.3.3 Monodromy in CFT


One way to make conformal symmetry manifest is to make the relevant physical quan- 
tities be holomorphic functions of (or more precisely, sections of bundles over) the 
appropriate moduli spaces. Let 𝒱 be the chiral algebra of an RCFT and let Φ label its
(finitely many) irreducible modules, that is the chiral primaries. Let 0 denote the one 
corresponding to the subspace V of H. Let’s investigate more closely what chiral blocks 
(4.3.7) are. 

In any RCFT, there are differential equations that the chiral blocks must satisfy. The 
most well known of these are the Knizhnik-Zamolodchikov (or KZ) equations. We studied 
these for WZW models at genus 0 in Section 3.2.4. Good expositions of this material are 
given in [355], [207], [186]. Differential equations can also be found using null vectors 
[50], and using the Ward identities. 

Return to the Ising model, introduced last subsection. We know its chiral blocks in 
genus-0 with two or three marked points (Question 4.3.5). Consider now four marked
points on the Riemann sphere, at positions w_i ∈ C ∪ {∞}. The chiral block will be the
product of the quantity


$$\prod_{1\le i<j\le 4} (w_i - w_j)^{-h_i-h_j+\frac{1}{3}\sum_k h_k} \qquad (4.3.12a)$$

(h_i being the conformal weight of the primary placed at w_i) with some function of the cross-ratio

$$w = \frac{(w_1 - w_2)(w_3 - w_4)}{(w_1 - w_3)(w_2 - w_4)}. \qquad (4.3.12b)$$

We can simplify this using the Möbius symmetry of the Riemann sphere to move w_i to
0, w, 1, ∞, respectively. If we label all four marked points with the primary field σ ∈ Φ,
then the space of chiral blocks is two-dimensional, spanned by


$$F_1(w) = \frac{\sqrt{1+\sqrt{1-w}}}{\sqrt2\,\bigl(w(1-w)\bigr)^{1/8}}, \qquad (4.3.13a)$$

$$F_2(w) = \frac{\sqrt{1-\sqrt{1-w}}}{\sqrt2\,\bigl(w(1-w)\bigr)^{1/8}}. \qquad (4.3.13b)$$

The fractional powers tell us these chiral blocks have branch-point singularities: that
is, to get a holomorphic function on the w-plane, we need to make semi-infinite cuts.
Nevertheless, we can analytically continue these functions along any curve. Take a point
$w_0$ so that $0 < |w_0| < 1$, and consider the circle $w(t) = w_0 e^{2\pi i t}$ for $0 \le t \le 1$. Nothing
special happens to the numerator of $F_1(w)$: its values at t = 0 and t = 1 are equal.
The numerator of $F_2(w)$ changes sign, because $1 - \sqrt{1-w} = w/2 + \cdots$ winds once around 0.
The common denominator picks up a factor $e^{2\pi i/8}$, and thus $F_1(w)$ picks up
a net factor of $e^{-2\pi i/8}$ while $F_2(w)$ picks up $-e^{-2\pi i/8} = e^{2\pi i\, 3/8}$. We call this the monodromy about w = 0 (Section 3.2.4).
Consider next their monodromy about w = 1. Here our circle will be $w(t) = 1 + w_0 e^{2\pi i t}$,
again for $w_0$ small. Note that the numerators of $F_1$ and $F_2$ switch, and the
denominators again pick up a factor of $e^{2\pi i/8}$. Thus this monodromy can be written


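$$\begin{pmatrix} F_1(w) \\ F_2(w) \end{pmatrix} \mapsto \begin{pmatrix} 0 & e^{-2\pi i/8} \\ e^{-2\pi i/8} & 0 \end{pmatrix} \begin{pmatrix} F_1(w) \\ F_2(w) \end{pmatrix}.$$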
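
These monodromies can be checked numerically. The following Python sketch continues the blocks (4.3.13) once around a small circle, tracking every root continuously rather than fixing a branch in advance. The function name `blocks_after_loop`, the radius 0.3 and the step count are illustrative choices, not taken from the text. About w = 0 it should find the factor $e^{-2\pi i/8}$ for $F_1$ and $-e^{-2\pi i/8}$ for $F_2$ (the extra sign arises because $1 - \sqrt{1-w} \approx w/2$ winds once around 0), and about w = 1 it should find that the two blocks are exchanged up to the common factor $e^{-2\pi i/8}$.

```python
import cmath

def blocks_after_loop(center, w0, steps=4000):
    """Continue (F1, F2) once around w(t) = center + w0*exp(2*pi*i*t).

    Returns ((F1, F2) at t=0, (F1, F2) at t=1). Each root is updated by
    multiplying with the principal root of a ratio close to 1, which keeps
    the continuation on a single continuous branch.
    """
    def point(k):
        return center + w0 * cmath.exp(2j * cmath.pi * k / steps)

    w = point(0)
    s = cmath.sqrt(1 - w)            # inner root sqrt(1 - w)
    top1 = cmath.sqrt(1 + s)         # numerator of F1
    top2 = cmath.sqrt(1 - s)         # numerator of F2
    den = (w * (1 - w)) ** 0.125     # (w(1 - w))^(1/8)
    root2 = cmath.sqrt(2)
    start = (top1 / (root2 * den), top2 / (root2 * den))

    for k in range(1, steps + 1):
        w_next = point(k)
        s_next = s * cmath.sqrt((1 - w_next) / (1 - w))
        top1 *= cmath.sqrt((1 + s_next) / (1 + s))
        top2 *= cmath.sqrt((1 - s_next) / (1 - s))
        den *= ((w_next * (1 - w_next)) / (w * (1 - w))) ** 0.125
        w, s = w_next, s_next

    end = (top1 / (root2 * den), top2 / (root2 * den))
    return start, end

phase = cmath.exp(-2j * cmath.pi / 8)

# Loop about w = 0: each block is multiplied by a phase.
(a1, a2), (b1, b2) = blocks_after_loop(0, 0.3)
print("about 0:", b1 / a1, b2 / a2)

# Loop about w = 1: the blocks are exchanged, up to the common phase.
(c1, c2), (d1, d2) = blocks_after_loop(1, 0.3)
print("about 1:", d1 / c2, d2 / c1)
```

The ratios printed for the loop about w = 1 compare the continued $F_1$ with the original $F_2$ and vice versa, which is exactly the off-diagonal pattern of the monodromy matrix.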
In Section 3.2.4 we explain how to think of this. Reintroducing the four coordinates $w_i$,
the chiral blocks $F_i$ will be holomorphic on the universal cover $\widetilde{\mathfrak{C}}_4$ of the configuration
space $\mathfrak{C}_4$ of (1.2.6). Analytically continuing along any closed path $\gamma$ in $\mathfrak{C}_4$ (across any
of those branch cuts) defines an action of the fundamental group $\pi_1(\mathfrak{C}_4)$ on the space
$\mathfrak{B}^{(0,4)}$ of chiral blocks. This group $\pi_1(\mathfrak{C}_4)$ is the pure braid group of the sphere with four
strands. An element $\beta$ of the full braid group of the sphere maps the space $\mathfrak{B}^{(0,4)}_{a_1 a_2 a_3 a_4}$
to $\mathfrak{B}^{(0,4)}_{a_{\pi 1} a_{\pi 2} a_{\pi 3} a_{\pi 4}}$, where $\pi$ is the associated permutation (the subscripts record the primaries $a_i \in \Phi$ labelling the four points), so in our example (4.3.13)
the full braid group acts. We can recover the usual planar braid groups $\mathcal{P}_3$ and $\mathcal{B}_3$ here
by fixing one of the four points at say $\infty$, and letting the others wander around.

Equivalently, as a ‘function’ on the configuration space, the chiral blocks form (multi- 
valued) holomorphic sections of a projective flat vector bundle. What this means is that 
each chiral block satisfies a system of partial differential equations (the KZ equations) 
describing how to parallel-transport it around the configuration space, and flatness says 
it will locally depend only on the moduli space parameters (and not on the path chosen).
Globally, however, there will be monodromy [437], [32], [355]. 

More generally, a chiral block F on an enhanced surface $\Sigma$ is a multi-valued function
on the corresponding moduli space. To make it well defined, F can be lifted to the
corresponding Teichmüller space. There will be an action of the corresponding mapping class
group $\widehat{\Gamma}_{g,n}$ coming from monodromy (a projective action, if as usual the central charge
c is nonzero). How to centrally extend these $\widehat{\Gamma}_{g,n}$ so that the projective representation
becomes a true one is discussed, for example, in [404]. This picture, which is explained
quite clearly in [32] and is developed further in, for example, Section 7.2.4, encompasses
not only the braid group monodromy of the KZ equation (Section 3.2.4) but also the
modular group action (4.3.9) on the graded dimensions (4.3.8a). It is the source of the
modularity in Moonshine.

Although the chiral blocks themselves are multi-valued functions on the moduli spaces
$\widehat{\mathfrak{M}}_{g,n}$, conformal invariance requires that the n-point correlation functions (4.3.1)
themselves be well-defined functions on $\mathfrak{M}_{g,n}$. For example, even though the graded
dimensions (4.3.8a) transform as in (4.3.9), the 1-loop partition function in (4.3.8b) is
$SL_2(\mathbb{Z})$-invariant. See also Question 4.3.7.

As we know from Section 2.2.1, there is more to being a modular form or function 
than transforming nicely with respect to SL2(Z). The behaviour at the cusps of H is 
also crucial, as it says our function lives on a compact space. Something similar also 
holds in RCFT. The analogue of cusps for the other moduli spaces — that is, the surfaces 
corresponding to the extra points needed for compactification — are surfaces with nodes 
(Section 2.1.4). What we need is nice behaviour of chiral blocks as we move in moduli 
space towards surfaces with nodes, that is, as we shrink a closed curve about a handle 
on our surface down to zero radius. This is given by (4.4.3) and is called factorisation 
[203], [539]. It connects the moduli spaces of different topologies, and tells us CFT is 
defined on a ‘universal tower’ of moduli spaces (Sections 3.1.2 and 6.3.3). 

Incidentally, it is tempting to try to extend this formalism to the ‘surfaces of infinite
genus’ given by projective limits $\varprojlim \Gamma\backslash\mathbb{H}$ (see Section 2.4.1). The discrete groups $\Gamma$
appearing in each such limit must all be commensurable (i.e. intersections of any two of
them should have finite index in both), in order for the limit to be defined. In Section 2.4.1
we describe the most famous piece of such a limit: the modular tower $\varprojlim \Gamma(N)\backslash\mathbb{H}$,
so important to number theory. The assignment of, for example, chiral blocks to such
‘surfaces’ may be built up from those of each $\Gamma\backslash\mathbb{H}$, in a relatively straightforward way;
because of this, perhaps we could interpret the string-theoretic data for $\varprojlim \Gamma\backslash\mathbb{H}$ as
the (nonperturbative?) contribution (‘sum’) associated collectively with all world-sheets
$\Gamma\backslash\mathbb{H}$ appearing in that limit. In any case, we are led to speculate from (2.4.3) that both
CFT and the theory of vertex operator algebras (and indeed Moonshine itself) may extend
quite nicely to the p-adics $\mathbb{Q}_p$. Some moves in this direction are [562], [520]. To a number
theorist, the usual perturbation about a vacuum would correspond to the infinite prime, 
but would mysteriously ignore the contributions from all the finite primes. It would be 
interesting to see if nonperturbative phenomena like D-branes can be sensed by these 
projective limits. 

As discussed at the end of Section 2.2.1, the analogue of q-expansions, for chiral 
blocks and partition functions in higher genus, are expansions about surfaces with 
nodes. A natural projectively flat connection on these spaces $\mathfrak{B}^{(g,n)}$ of chiral blocks
is given by the stress—energy tensor T(z) [203], [530]; this connection is responsi- 
ble for the KZ equations, and is the analogue here of the Witt action on moduli 
spaces, and the meaning of T(z) insertions into correlation functions discussed in 
Section 4.3.2. 


4.3.4 Twisted #4: the orbifold construction 
To particles, a space-time singularity is a problem; to strings, it is merely a region where 
stringy effects are large. The most tractable way to introduce such singularities is by 
quotienting (‘gauging’) by a finite group. This construction plays a fundamental role 
for CFTs and vertex operator algebras; it is the physics underlying what Norton calls 
generalised Moonshine (Section 7.3.2). This is where finite group theory touches CFT. 

Let M be a manifold and G a finite group of symmetries of M. The set M/G of G-orbits
inherits a topology from M, and forms a manifold-like space called an orbifold. Fixed
points become conical singularities. For example, $\{\pm 1\}$ acts on $M = \mathbb{R}$ by multiplication.
The orbifold $\mathbb{R}/\{\pm 1\}$ can be identified with the half-line $x \ge 0$. The fixed point at x = 0
becomes a singular point on the orbifold, that is, a point where locally the orbifold does
not look like some open n-ball (open interval in this one-dimensional case). For other
examples, see Question 4.3.8.

Orbifolds were introduced into geometry in the 1950s as spaces with mild singularities; 
recalling Definition 1.2.3, they are $V_\alpha/G_\alpha$ patched together, where $V_\alpha \subset \mathbb{R}^n$ is open and
G, is a finite group. They were introduced into string theory in [143], which greatly 
increased the class of background space-times in which the string could live and still be 
amenable to calculation. This subsection briefly sketches the corresponding construction 
for CFT; our purpose is to motivate Section 5.3.6. 

For concreteness think of a closed string whose world-sheet $\Sigma \subset M$ is a torus, since
the 1-loop partition function (4.3.8b) is the easiest way to obtain the spectrum (4.3.6)
of the theory. Think of $\Sigma$ being parametrised by $z \in \mathbb{C}/(\tau\mathbb{Z} + 2\pi\mathbb{Z})$, with $\tau$ being the
time-period of the 1-loop and $2\pi$ being the space-period of the closed string. Here, G
is a finite group of symmetries of the theory: it acts not only on space-time M, but
also on the internal states of the string (i.e. the state-space carries a representation
of G). Assume for now that G is abelian and that $\mathcal{H} = \mathcal{V} \otimes \overline{\mathcal{V}}$. For example, this is
satisfied by the WZW theory for $E_8^{(1)}$ at level 1, or strings living on the torus $\mathbb{R}^n/L$ for
an even self-dual n-dimensional lattice L. Consider first the chiral data. The orbifold
chiral algebra $\mathcal{V}^{\mathrm{orb}}$ is the subalgebra $\mathcal{V}^G$ of $\mathcal{V}$ consisting of all G-invariant fields. More
difficult to answer is what the orbifold state-space $\mathcal{H}^{\mathrm{orb}}$ looks like.

In the case of a point particle, a 1-loop world-line $x(t) \in M/G$ would be a circle, the
motion x(t) would be periodic (say with period T); lifting x(t) to M, we would require that
x(T) = g.x(0) for some $g \in G$. The closed string also requires this twisted periodicity
in the time direction, but being closed it will similarly have a twisted periodicity in the
space direction. Thus we are led to consider string processes satisfying the boundary
conditions

$$x(z + \tau) = g.x(z), \qquad x(z + 2\pi) = h.x(z). \qquad (4.3.14a)$$

The strings satisfying $x(2\pi) = h.x(0)$ form the h-twisted sector $\mathcal{V}^h$; these twisted sectors
are the special feature of strings living on orbifolds. They don’t live in the original
chiral space $\mathcal{V}$, and are hard to construct; in particular, there isn’t a systematic twisted
analogue of the vertex operator construction (i.e. exponentials of free fields) of untwisted
sectors.

The contribution of the processes (4.3.14a) to the 1-loop path integral will be

$$Z_{(g,h)}(\tau) = \mathrm{tr}_{\mathcal{V}^h}\, g\, e^{2\pi i \tau (L_0 - c/24)}, \qquad (4.3.14b)$$

for reasons that will become clearer next section (the trace comes from obtaining the
torus by sewing together the inner and outer boundaries of an annulus). Each
(finite-dimensional) $L_0$-eigenspace in $\mathcal{V}^h$ carries a representation of the group $\langle g \rangle$, so that is
the matrix to substitute into the trace (4.3.14b). The modular group $SL_2(\mathbb{Z})$ acts on the
cycles (homology $H_1$) of the torus in the usual way, which gives the behaviour of $Z_{(g,h)}$
under modular transformations:

$$Z_{(g,h)}\left(\frac{a\tau+b}{c\tau+d}\right) = Z_{(g^a h^c,\, g^b h^d)}(\tau). \qquad (4.3.14c)$$

Actually, we will find shortly that in general this transformation has to be modified
slightly.
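
To see how (4.3.14c) mixes the boundary conditions, here is a minimal Python sketch for a cyclic group $G = \mathbb{Z}_n$, written additively so that $(g, h) \mapsto (ag + ch,\, bg + dh)$ mod n. The function name `sl2z_orbit` and the choice of generators S and T are illustrative assumptions, not notation from the text.

```python
def sl2z_orbit(pair, n):
    """Orbit of a boundary condition (g, h) in Z_n x Z_n under SL2(Z).

    G = Z_n is written additively, so (4.3.14c) reads
    (g, h) -> (a*g + c*h, b*g + d*h) mod n.
    """
    def act(p, a, b, c, d):
        g, h = p
        return ((a * g + c * h) % n, (b * g + d * h) % n)

    generators = [(1, 1, 0, 1),    # T: tau -> tau + 1
                  (0, -1, 1, 0)]   # S: tau -> -1/tau
    orbit, frontier = {pair}, [pair]
    while frontier:
        p = frontier.pop()
        for m in generators:
            q = act(p, *m)
            if q not in orbit:
                orbit.add(q)
                frontier.append(q)
    return orbit

# For G = Z_2 (element 1 standing for the nontrivial symmetry), the orbit of
# "g inserted in the untwisted sector" already contains both twisted-sector terms.
print(sorted(sl2z_orbit((1, 0), 2)))   # [(0, 1), (1, 0), (1, 1)]
print(sorted(sl2z_orbit((0, 0), 2)))   # [(0, 0)]
```

So for $\mathbb{Z}_2$ the three nontrivial boundary conditions form a single orbit: once $Z_{(g,1)}$ appears, modular transformations force the twisted-sector contributions to appear as well.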

The twisted sector $\mathcal{V}^h$ is an irreducible (twisted) module for the original chiral algebra
$\mathcal{V}$ (Section 5.3.6). In terms of the orbifold chiral algebra $\mathcal{V}^G$, $\mathcal{V}^h$ will be a true module,
though not an irreducible one. Its decomposition (‘branching rules’) into irreducible
$\mathcal{V}^G$-modules is

$$\mathcal{V}^h = \bigoplus_\rho \mathcal{V}^h_\rho \otimes \rho, \qquad (4.3.15a)$$

where the sum is over all irreducible G-representations $\rho$ (when G is non-abelian, this
will be modified slightly). Plugging this into (4.3.14b) gives the equivalent expressions

$$Z_{(g,h)}(\tau) = \sum_\rho \mathrm{ch}_\rho(g)\, \chi^h_\rho(\tau), \qquad (4.3.15b)$$

$$\chi^h_\rho(\tau) = \mathrm{tr}_{\mathcal{V}^h_\rho}\, e^{2\pi i \tau (L_0 - c/24)} = \frac{1}{\|G\|} \sum_{g \in G} \mathrm{ch}_\rho(g)^*\, Z_{(g,h)}(\tau). \qquad (4.3.15c)$$

The graded dimension $\chi^h_\rho$, unlike $Z_{(g,h)}$, has a q-expansion with coefficients in $\mathbb{N}$,
but $Z_{(g,h)}$ has the simpler modular behaviour, in perfect analogy to the pair of theta functions
contrasted in (2.2.11) and (2.3.10).

An important example of this orbifold construction is the Moonshine module $V^\natural$
(Sections 5.3.6 and 7.2.1). Its starting point is the chiral algebra $\mathcal{V}(\Lambda)$ for the torus $\mathbb{R}^{24}/\Lambda$,
where $\Lambda$ is the Leech lattice. The symmetry group G corresponds to the centre $\{\pm 1\}$ of
$\mathrm{Aut}(\Lambda)$. The graded dimension of the untwisted sector $\mathcal{V}(\Lambda)$ is $Z_{(1,1)}(\tau) = J(\tau) + 24$,
and it has $-1$-twisted graded dimension

$$Z_{(-1,1)}(\tau) = q^{-1} \prod_{n=0}^{\infty} (1 + q^{n+1})^{-24} = q^{-1} - 24 + 276q - 2048q^2 + \cdots$$

The $-1$-twisted sector $\mathcal{V}(\Lambda)^{-1}$ has untwisted/twisted graded dimension

$$Z_{(\pm 1,-1)}(\tau) = 2^{12} q^{1/2} \prod_{n=0}^{\infty} (1 \mp q^{n+1/2})^{-24} = 2^{12} q^{1/2} \pm 98304q + 1228800q^{3/2} \pm 10747904q^2 + \cdots$$



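The Moonshine module $V^\natural$ consists of the sectors $\mathcal{V}(\Lambda)^{1}_{+} \oplus \mathcal{V}(\Lambda)^{-1}_{-}$ (the subscripts $\pm$ denoting the two irreducible representations of $\{\pm 1\}$) and so has graded
dimension

$$\chi_{V^\natural}(\tau) = \chi^{1}_{+}(\tau) + \chi^{-1}_{-}(\tau) = \frac{1}{2}\left(Z_{(1,1)}(\tau) + Z_{(-1,1)}(\tau) + Z_{(1,-1)}(\tau) - Z_{(-1,-1)}(\tau)\right) = J(\tau). \qquad (4.3.16)$$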
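
The identity (4.3.16) can be tested order by order. The sketch below is an illustration only: it works with integer power series in $x = q^{1/2}$, multiplies everything by q to avoid negative powers, computes $J = E_4^3/\Delta - 744$ independently, and takes $Z_{(-1,-1)}$ to be $2^{12}q^{1/2}\prod_n (1+q^{n+1/2})^{-24}$, an assumption consistent with the expansions quoted above. The helper names (`divide`, `multiply`, `q_times_J`) and the truncation order `N` are hypothetical choices.

```python
N = 14  # series in x = q^(1/2), truncated to the powers x^0 .. x^(N-1)

def divide(series, k, sign, times):
    """Divide a truncated series in place by (1 + sign*x^k)^times."""
    for _ in range(times):
        for i in range(k, len(series)):
            series[i] -= sign * series[i - k]
    return series

def multiply(a, b):
    """Truncated product of two series of equal length."""
    out = [0] * len(a)
    for i, ai in enumerate(a):
        for j in range(len(a) - i):
            out[i + j] += ai * b[j]
    return out

def sigma3(n):
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def q_times_J():
    """q*J = E4^3 / prod_n (1 - q^n)^24 - 744*q, as a series in x."""
    e4 = [0] * N
    e4[0] = 1
    for n in range(1, (N - 1) // 2 + 1):
        e4[2 * n] = 240 * sigma3(n)
    series = multiply(multiply(e4, e4), e4)
    for n in range(1, (N - 1) // 2 + 1):
        divide(series, 2 * n, -1, 24)
    series[2] -= 744
    return series

qJ = q_times_J()

# q*Z_(1,1) = q*(J + 24)
z11 = list(qJ)
z11[2] += 24

# q*Z_(-1,1) = prod_{n>=1} (1 + q^n)^(-24)
zm1 = [1] + [0] * (N - 1)
for n in range(1, (N - 1) // 2 + 1):
    divide(zm1, 2 * n, +1, 24)

# q*Z_(+-1,-1) = 2^12 x^3 prod_{k odd} (1 -+ x^k)^(-24)
z1m = [0] * N
zmm = [0] * N
z1m[3] = zmm[3] = 2 ** 12
for k in range(1, N, 2):
    divide(z1m, k, -1, 24)
    divide(zmm, k, +1, 24)

total = [z11[i] + zm1[i] + z1m[i] - zmm[i] for i in range(N)]
orbifold = [t // 2 for t in total]
print(orbifold[:8])
```

Index 2m of `orbifold` holds the coefficient of $q^{m-1}$ in the orbifold graded dimension, so the entries at indices 0, 2, 4, 6 are the coefficients of $q^{-1}$, $q^0$, $q^1$, $q^2$.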


So far we have discussed only the chiral orbifold CFT, our main interest. The
state-space (4.3.6) of the full orbifold CFT can look like

$$\mathcal{H}^{\mathrm{orb}} = \bigoplus_{h,\rho} \left(\mathcal{V}^h_\rho \otimes \overline{\mathcal{V}^h_\rho}\right). \qquad (4.3.17)$$

There are other possibilities for $\mathcal{H}^{\mathrm{orb}}$; a systematic but far from exhaustive source is
provided by discrete torsion [136]. The lattice construction L {T } of Section 2.3.3 (applied
to indefinite lattices L) is this orbifold construction of $\mathcal{H}^{\mathrm{orb}}$, coming largely from discrete
torsion. The construction of $V^\natural$ is a heterotic version (i.e. with trivial ‘anti-holomorphic’
chiral algebra $\overline{\mathcal{V}}$). In any case, the full orbifold theory will typically involve most sectors
$\mathcal{V}^h_\rho$. Modular invariance (4.3.14c) is one way to see the necessity of this; another is string
dynamics (see figure 8.1 in [463], vol. I).

There are three significant generalisations of this orbifold construction as outlined
above. Non-abelian orbifold groups G are at least as interesting to us (e.g.
Maxi-Moonshine concerns $V^\natural/\mathbb{M}$), and introduce new subtleties. For example, using (4.3.14a)
to evaluate $x((z + \tau) + 2\pi) = x((z + 2\pi) + \tau)$ requires hg.x(z) = gh.x(z). That is, we
should limit ourselves to boundary conditions (4.3.14a) whose pairs (g, h) commute.
Moreover, consider the h-twisted sector $x(2\pi) = h.x(0)$; hitting both sides with $g \in G$
yields $(g.x)(2\pi) = (ghg^{-1}).(g.x)(0)$, that is, the twisted sectors $\mathcal{V}^h$ and $\mathcal{V}^{ghg^{-1}}$ are
naturally isomorphic. In fact, $Z_{(g,h)} = Z_{(kgk^{-1},\,khk^{-1})}$ for any $k \in G$, so we should identify
each boundary condition (g, h) with all simultaneous conjugations $(kgk^{-1}, khk^{-1})$. This
will be clearer in Sections 5.3.6 and 6.2.4. The sums in (4.3.15) are over all $g \in C_G(h)$
and all irreducible $C_G(h)$-representations $\rho$, where $C_G(h)$ is the centraliser of h in G.
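
As a small illustration of this bookkeeping, the following Python sketch enumerates the commuting pairs (g, h) of a finite permutation group and groups them under simultaneous conjugation. The function names and the choice of the symmetric group $S_3$ as test case are illustrative assumptions, not taken from the text.

```python
from itertools import permutations, product

def compose(p, q):
    """Product p*q of permutations given as tuples: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def twisted_boundary_conditions(group):
    """Classes of commuting pairs (g, h) under (g, h) -> (k g k^-1, k h k^-1)."""
    classes = set()
    for g, h in product(group, repeat=2):
        if compose(g, h) != compose(h, g):
            continue  # (4.3.14a) is only consistent for commuting pairs
        orbit = frozenset(
            (compose(compose(k, g), inverse(k)),
             compose(compose(k, h), inverse(k)))
            for k in group)
        classes.add(orbit)
    return classes

S3 = list(permutations(range(3)))
classes = twisted_boundary_conditions(S3)
print(len(classes))                       # 8
print(sum(len(c) for c in classes))       # 18 commuting pairs in total
```

The count 8 agrees with choosing h up to conjugacy (three classes in $S_3$, with centralisers $S_3$, $\mathbb{Z}_2$, $\mathbb{Z}_3$) and then g up to conjugacy in $C_G(h)$: 3 + 2 + 3 = 8.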

For the second generalisation, note that $g \in C_G(h)$ takes the sector $\mathcal{V}^h$ to $\mathcal{V}^{ghg^{-1}} = \mathcal{V}^h$,
so (as in Section 1.5.4) we get a linear map $\phi_g^h : \mathcal{V}^h \to \mathcal{V}^h$. So far we have implicitly
assumed that these assignments $g \mapsto \phi_g^h$ define a representation of $C_G(h)$. But $\mathcal{V}^h$ are
chiral data and so group actions, etc. may be projective. That is, we only know that
$g \mapsto \phi_g^h$ defines a projective representation of $C_G(h)$. In this case, (4.3.14c) must be
replaced by

$$Z_{(g,h)}\left(\frac{a\tau+b}{c\tau+d}\right) = \gamma\, Z_{(g^a h^c,\, g^b h^d)}(\tau), \qquad (4.3.18)$$

for some root of unity $\gamma$. See [138], and Section 5.3.6 below, for details. For example,
the Maxi-Moonshine orbifold $V^\natural/\mathbb{M}$ will necessarily be of that projective type [408].

For the final generalisation, we have discussed orbifolding the CFTs with one chiral
primary (i.e. with $\|\Phi\| = 1$) only because they are simpler. The behaviour of more typical
multi-primary orbifolds is analogous (Section 5.3.6). For example, the horizontal line
of c = 1 CFTs in Figure 4.14 corresponds to bosons compactified on a circle of radius
r, while the vertical line there corresponds to bosons on the orbifold $S^1/\mathbb{Z}_2$ (see the
treatment in [246]); most of these theories have infinitely many chiral primaries (i.e.
aren’t rational). The WZW theory for $A_1^{(1)}$ at level 1 is a c = 1 theory with two chiral
primaries corresponding to a string living on $S^3$; we can orbifold this rational theory
by any of the finite subgroups of $SU_2(\mathbb{C})$. These subgroups fall into an A-D-E pattern
(Section 2.5.2). Orbifolding by the (cyclic) A-series of subgroups gives the c = 1 theories
$r_\circ = n/\sqrt{2}$, and by the (dihedral) D-series gives the c = 1 theories $r_{\mathrm{orb}} = n/\sqrt{2}$. The
(tetrahedral) $E_6$-, (octahedral) $E_7$- and (icosahedral) $E_8$-subgroups give us the isolated
theories T, O, I of Figure 4.14.

Choose any CFT $\mathcal{H}$ and tensor it with itself n times to get a new CFT $\mathcal{H}^{\otimes n}$. The
orbifold $\mathcal{H}^{\otimes n}/S_n$ is called a permutation orbifold. Requiring that $\mathcal{H}^{\otimes n}/S_n$ possesses the
standard CFT properties imposes highly nontrivial conditions on the chiral data of $\mathcal{H}$.
See, for example, [37] for applications of this powerful theoretical tool.


4.3.5 Braided #4: the braid group in quantum field theory 


Much of Moonshine is implicit in two-dimensional CFT. What is the most distinctive 
physical feature of two-dimensional quantum field theory? 

In three or more dimensions, the rotation group $SO_n(\mathbb{R})$ is non-abelian. We know
everything about the finite-dimensional unitary projective representations of this simple
Lie group: there are countably many, namely the highest-weight representations of its
universal cover $\mathrm{Spin}_n(\mathbb{R})$. Physically, we know these fall into two families
(‘superselection sectors’), depending on what happens after a rotation by $2\pi$: the true
representations of $SO_n(\mathbb{R})$ (the ‘integer-spin’ bosons) and those that are merely projective (the
‘half-integer spin’ fermions).

In two dimensions, this familiar picture collapses, as the rotation group $SO_2(\mathbb{R})$ is
isomorphic to $S^1$ and has universal cover $\mathbb{R}$. The unitary representations are parametrised
by the ‘unitary duals’ $\widehat{S^1} = \mathbb{Z}$ and $\widehat{\mathbb{R}} = \mathbb{R}$, respectively. In particular, the element $x \in \mathbb{R}$
is sent to the 1 × 1 matrix $e^{2\pi i \alpha x}$ for ‘spin’ $\alpha \in \widehat{\mathbb{R}} = \mathbb{R}$. The behaviour (monodromy) of
these representations under rotations by $2\pi$ again determines the physics, and instead of
the boson/fermion alternative, we get superselection sectors parametrised by $\widehat{\mathbb{R}}/\widehat{S^1} = \mathbb{R}/\mathbb{Z}$.

The different physics of bosons and fermions is revealed by the spin-statistics
relation. Define as in (1.2.6) the configuration space $\mathfrak{C}_n(\mathbb{R}^d)$ of n distinct points $x^{(i)}$ in $\mathbb{R}^d$,
consisting of n copies of $\mathbb{R}^d$ with all diagonals $x^{(i)} = x^{(j)}$ deleted. We are interested in
these describing the positions of n identical particles, so for each permutation $\sigma \in S_n$
identify $(x^{(1)}, \ldots, x^{(n)}) \in \mathfrak{C}_n(\mathbb{R}^d)$ with $(x^{(\sigma 1)}, \ldots, x^{(\sigma n)})$. A closed loop in $\mathfrak{C}_n(\mathbb{R}^d)/S_n$
corresponds to an explicit rearrangement of the n particles. It is important to note that,
for any n, d, the space of trajectories will be disconnected. In Feynman’s formalism,
this means we have the freedom to introduce relative factors between the corresponding
disjoint path integrals. By unitarity these factors should be phases (complex numbers of
modulus 1), and consistency requires them to define a representation of the fundamental
group $\pi_1(\mathfrak{C}_n(\mathbb{R}^d)/S_n)$. For d > 2 this fundamental group is the symmetric group $S_n$, and so
there are only two possible choices for these relative phases, corresponding to the two
one-dimensional representations of $S_n$: all +1’s, or $\det(\sigma)$. The spin-statistics theorem
[518] tells us that +1 corresponds to bosons and $\det(\sigma)$ to fermions.

In two dimensions, the fundamental group is the braid group $\mathcal{B}_n$, and its
one-dimensional unitary representations are parametrised by $t \in \mathbb{R}/\mathbb{Z}$ and defined by
$\sigma_i \mapsto e^{2\pi i t}$. This t parametrises the different consistent assignments of phases to the disjoint
integrals in the Feynman expressions. Again, the spin-statistics theorem relates this
phase assignment to spin: this t is the same as the spin $\alpha$ (mod 1). This is called braid
statistics for obvious reasons. Such particles are called plektons (after the Greek word
for ‘braid’) or anyons (since they can have any spin).

One-dimensional representations of $S_n$ or $\mathcal{B}_n$ are the simplest. Higher-dimensional
representations would indicate an internal structure and are considered in, for example, 
parastatistics. In Section 4.3.3 we see how higher-dimensional representations arise in a 
similar way in CFT. See, for example, [204], [191] for some general treatments of braid 
statistics in CFT. Possible physical realisations of braid statistics are reviewed in [557], 
[345]. In particular, subjecting certain semiconductors to large magnetic fields and cold 
temperatures yields the so-called fractional quantum Hall effect, and its quasi-particles 
provide an actual realisation of anyons. Since braid statistics is a topological effect, 
it is intimately related to the Aharonov-Bohm effect (a notorious topological effect in
quantum theories). 

So two dimensions are special for quantum field theory. We know four dimensions are
special in differential geometry [195]. For example, in any $\mathbb{R}^n$ all differential structures
are equivalent, except n = 4 where there are uncountably many inequivalent ones
(Section 1.2.2). Are those two dimensions related to these four dimensions, and are
they related to the apparent four-dimensionality of macroscopic space-time? This isn’t
clear to this author.

The possibility of braid statistics arises in two dimensions because the space-like 
vectors in two-dimensional space-time are disconnected. The other special features of 
two dimensions are all related to this. As we discuss in Section 4.3.2, the space of local 
conformal transformations is finite-dimensional in n dimensions, except for n = 2 where
it is infinite. The light-cone minus the origin is also disconnected in two dimensions, and 
this implies the existence of infinitely many conserved currents. 

What makes four dimensions special in differential geometry is the behaviour of
embedded 2-discs (many proofs in n dimensions are based on understanding that
behaviour). A generic map of a disc into an n-manifold has self-intersections that are
one-dimensional if n = 3, which consist of isolated points if n = 4, and are non-existent if
$n \ge 5$. Also, the Seiberg-Witten equations (so useful for studying 4-manifolds) exploit
the fact that the rotation algebra $\mathfrak{so}_4 \cong \mathfrak{so}_3 \oplus \mathfrak{so}_3$ (corresponding to a group $SO_4(\mathbb{R})$
homeomorphic to $S^3 \times \mathbb{P}^3(\mathbb{R})$) is nonsimple, while in all other dimensions n > 2, $\mathfrak{so}_n$ is
simple.


Question 4.3.1. (a) Consider the free scalar theory in d dimensions, given by Lagrangian
$\mathcal{L} = -\frac{1}{2} \sum_\mu \partial_\mu \phi\, \partial^\mu \phi$. Assuming scale-invariance of $\mathcal{L}$, deduce the scaling dimension
of $\phi$.

(b) This theory is massless. What happens when the mass term is introduced? 


Question 4.3.2. Prove that when $p + q \ne 2$, the infinitesimal conformal symmetries
of $\mathbb{R}^{p,q}$ form a finite-dimensional Lie algebra, but that it is infinite-dimensional when
p + q = 2. (That is, write $x^\mu \mapsto x^\mu + \epsilon^\mu(x)$; we’re interested in those infinitesimal $\epsilon^\mu$
for which the metric $ds^2$ goes to a multiple of itself.)


Question 4.3.3. Let $\Sigma$ be a Riemann surface of genus g with n discs removed. Suppose
it is dissected into N ‘pairs-of-pants’ (i.e. spheres with three discs removed). Prove that 
this dissection is possible only if n + 2g > 2, in which case N = n + 2g — 2. 


Question 4.3.4. Assuming (4.3.5) and the state-field correspondence, prove $L_1 v = 0$ and
$L_0 v = hv$.


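Question 4.3.5. Suppose $L_1 v_i = 0$ and $L_0 v_i = h_i v_i$. Compute the chiral blocks

$$\langle \varphi_{v_1}(z_1)\, \varphi_{v_2}(z_2) \rangle = \begin{cases} C_{12}\,(z_1 - z_2)^{-2h_1} & \text{if } h_1 = h_2 \\ 0 & \text{otherwise} \end{cases},$$

$$\langle \varphi_{v_1}(z_1)\, \varphi_{v_2}(z_2)\, \varphi_{v_3}(z_3) \rangle = \frac{C_{123}}{(z_1 - z_2)^{h_1+h_2-h_3}\,(z_2 - z_3)^{h_2+h_3-h_1}\,(z_1 - z_3)^{h_1+h_3-h_2}},$$

for constants $C_{12}$, $C_{123}$, using (4.3.5).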


Question 4.3.6. Describe the monodromy (if any) about $w = \infty$ of the chiral blocks in
(4.3.13).
Fig. 4.15 A morphism $\Sigma : C_1 \to C_2$.

Question 4.3.7. Find the sesquilinear combinations $\sum_{i,j=1,2} c_{ij} F_i(w) \overline{F_j(w)}$ of the chiral
blocks in (4.3.13), which are invariant under the various monodromies. (The physical
correlation functions will be of that form.)


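Question 4.3.8. Describe the following orbifolds: (a) $(\mathbb{R}/\mathbb{Z})/\{\pm 1\}$; (b) $(\mathbb{C}/(\mathbb{Z}\tau + \mathbb{Z}))/\{\pm 1\}$;
(c) $(\mathbb{C}/(\mathbb{Z} + i\mathbb{Z}) \setminus \Delta)/\mathbb{Z}_2$, where $\Delta$ is the diagonal x + ix, and $\mathbb{Z}_2$ acts by
identifying (x, y) and (y, x).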


4.4 Mathematical formulations of conformal field theory 


In Sections 4.3.2 and 4.3.3 we gave a quick standard sketch of the basics of CFT, 
introducing the reader to the main notions. In this section, as well as Chapter 5 and Sec- 
tion 6.1, we explore certain aspects of CFT more carefully, clarifying them considerably. 
Surprisingly, many of these aspects are fundamental to Moonshine. 


4.4.1 Categories 


A deeply influential formulation of CFT is due to Graeme Segal [500], [502], [498]; see 
also [241]. It is motivated by string theory (Section 4.3.1) and is phrased using category 
theory (Section 1.6.1). According to Segal, a CFT is a functor S from a category C of 
Riemann surfaces (the world-sheets) to the category Hilb of Hilbert spaces (the state- 
spaces). 

The objects of category C are finite disjoint unions $C_n$ of n circles, for all $n \ge 0$. We
fix a parametrisation on these circles, that is, a smooth identification $\iota$ of each circle C
with $\mathbb{R}/\mathbb{Z}$; this induces an orientation on C. A morphism $C_m \to C_n$ is a (not necessarily
connected) Riemann surface $\Sigma$ with boundary $\partial\Sigma$ consisting of m + n parametrised
circles; exactly n of those boundary circles come with parametrisations consistent with
the orientation of $\Sigma$ induced from its complex structure. We think of these n as ‘outgoing’
strings and the remaining m as ‘incoming’ ones. For example, in Figure 4.15 the solid
circles are outgoing and the dashed one is incoming. We identify two such morphisms
$\Sigma : C_m \to C_n$, $\Sigma' : C_m \to C_n$ if there is a conformal map $f : \Sigma \to \Sigma'$ such that the
parametrisations $\iota_i$ and $\iota'_i \circ f$ of the boundaries $\partial\Sigma$ and $\partial\Sigma'$ agree.

The space $\mathrm{Hom}(C_m, C_n)$ is topological, with a connected component $\mathcal{C}_\Sigma$ for each
homeomorphism class $[\Sigma]$ of (not necessarily connected) surfaces with boundary having
m + n components. For example, $\mathrm{Hom}(C_0, C_0)$ has one component for every choice of
$n_0$ spheres, $n_1$ tori, ..., $n_g$ compact genus-g surfaces, ..., provided $\sum_g n_g < \infty$.

Fig. 4.16 An example of sewing.

Finally, the composition $\Sigma' \circ \Sigma$ of morphisms $\Sigma : C_m \to C_n$, $\Sigma' : C_n \to C_p$ is
obtained by sewing together the surfaces $\Sigma$ and $\Sigma'$ along the circles in $C_n$ by using the
parametrisation to identify corresponding points on the boundaries. In fact, this sewing
construction is the main reason we require these boundary circles to be parametrised.

Recalling Definition 2.1.6, this space $\mathcal{C}_\Sigma$ can be regarded as the quotient of the space
of complex structures on $\Sigma$, by the group of all diffeomorphisms of $\Sigma$ that are the
identity on the boundary $\partial\Sigma$. Thus, $\mathcal{C}_\Sigma$ is an infinite-dimensional moduli space. Write
$\mathcal{C}_{g,k}$ for the component of $\mathrm{Hom}(C_m, C_n)$ corresponding to connected genus-g surfaces
(with k = m + n punctures); this is the most interesting part of $\mathcal{C}_\Sigma$. Recall the enhanced
moduli space Pty defined in Section 2.1.4; provided only that k > 0, Cg, is a finite- 
dimensional complex manifold, unlike Mok, and can be expressed as a bundle over My, k 
with infinite-dimensional fibre (page 453 of [502]). The mapping class group for $\mathcal{C}_{g,k}$ is
the $\widehat{\Gamma}_{g,k}$ of Section 2.1.4, that is, an extension of $\Gamma_{g,k}$ by k copies of $\mathbb{Z}$.

The most important space is that $\mathcal{C}_{0,2}$ of annuli. We get the easy homeomorphism

$$\mathcal{C}_{0,2} \cong (0, 1) \times (\mathrm{Diff}^+(S^1) \times \mathrm{Diff}^+(S^1))/S^1. \qquad (4.4.1)$$

The interval (0,1) arises because any annulus is conformally equivalent to r < |z| < 1 for some
0 < r < 1. The two copies of $\mathrm{Diff}^+(S^1)$ correspond to reparametrisations of the two
boundary circles; this is where the two copies $L_n$, $\overline{L}_m$ of Witt arise. We factor out by $S^1$
since rotations are the only holomorphic automorphisms of r < |z| < 1.

A CFT is (among other things) a projective representation of category C: to each object
$C_n$ we assign a vector space $S(C_n)$, and to each morphism $\Sigma : C_m \to C_n$ a linear map
$S(\Sigma) : S(C_m) \to S(C_n)$, such that for any objects $C_m$, $C_n$, $C_p$ and morphisms $\Sigma' : C_m \to C_n$,
$\Sigma : C_n \to C_p$, we obtain the functorial sewing axiom

$$S(\Sigma \circ \Sigma') = c(\Sigma, \Sigma')\, S(\Sigma) \circ S(\Sigma') \qquad (4.4.2)$$

for some nonzero $c(\Sigma, \Sigma') \in \mathbb{C}$. More precisely, $S(C_n)$ is the tensor product
$\mathcal{H} \otimes \cdots \otimes \mathcal{H} =: \mathcal{H}^{\otimes n}$ of the state-space $\mathcal{H}$ of our CFT, and $\mathcal{H}^{\otimes 0} := \mathbb{C}$. Here, $\mathcal{H}$ is something like
the space $L^2(\mathcal{L}M)$ of wave-functions on the loop-space $\mathcal{L}M := \{f : S^1 \to M\}$, where
M is the space-time in which the string lives. Convergence in the Figure 4.16 sewing
operation described below requires the operator $S(\Sigma)$ to be trace class.
The idea is for $S(\Sigma)$ to mimic the Feynman path integral (4.2.13a), while avoiding
the latter’s analytic challenges. In string theory, the incoming state $|\mathrm{in}\rangle$ consists of a
choice of string state for each of the m circles, so $|\mathrm{in}\rangle \in \mathcal{H}^{\otimes m}$; similarly $|\mathrm{out}\rangle \in \mathcal{H}^{\otimes n}$.
Segal’s operator $S(\Sigma)$ is none other than the (finite) scattering matrix, or the
time-evolution operator $e^{iHt}$ (holomorphically extended to imaginary time): the desired string
amplitude is $\langle \mathrm{out}|S(\Sigma)|\mathrm{in}\rangle$. This is what Segal is trying to capture formally.

If $\Sigma$ is the disjoint union of surfaces $\Sigma_1$ and $\Sigma_2$, then $S(\Sigma) = S(\Sigma_1) \otimes S(\Sigma_2)$. That
the fundamental identity (4.4.2) should hold can be seen by cutting open a Feynman path
integral: an integral over all paths starting from $\alpha$ at time 0 to $\omega$ at time 1 can be expressed
as the integral over all possible $\mu$ of all paths starting from $\alpha$ at t = 0 to $\mu$ at t = 0.5,
and all paths from $\mu$ at t = 0.5 to $\omega$ at t = 1. This is just matrix multiplication, as (4.4.2)
suggests. A physical description of sewing can be found, for instance, in section 9.3
of [253]. To construct the projective factor in (4.4.2), Segal uses the ‘determinant line
bundle’ [192] (see e.g. [498], [502] for details). An alternate approach to central charge
$c \ne 0$ within the Segal formalism is given in lecture 2 of [241].

Another kind of sewing occurs when two oppositely oriented boundary components
of $\Sigma$ are sewn together, increasing the genus by 1, as is illustrated in Figure 4.16.
Algebraically, this corresponds to taking a trace or a sum using the Hermitian form. (To
see why this is compatible with (4.4.2), interpret matrix multiplication as a trace of the
tensor product of the matrices.)
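
The parenthetical remark can be made concrete in finite dimensions. In the following Python sketch (an illustration only; the names `tensor`, `sew` and `trace` are hypothetical, and small integer matrices stand in for the operators $S(\Sigma)$), the matrix product is obtained by contracting the two inner indices of the tensor product, and sewing the remaining pair of ends is an ordinary trace.

```python
def tensor(a, b):
    """Tensor product with entries T[i][j][k][l] = a[i][j] * b[k][l]."""
    return [[[[a[i][j] * b[k][l] for l in range(len(b[0]))]
              for k in range(len(b))]
             for j in range(len(a[0]))]
            for i in range(len(a))]

def sew(t):
    """Contract the two inner indices: result[i][l] = sum_j T[i][j][j][l]."""
    return [[sum(t[i][j][j][l] for j in range(len(t[i])))
             for l in range(len(t[i][0][0]))]
            for i in range(len(t))]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 6]]

AB = sew(tensor(A, B))      # composition of morphisms = matrix product
BA = sew(tensor(B, A))
print(AB)                   # [[10, 13], [20, 27]]
print(trace(AB), trace(BA)) # 37 37
```

The equality of the two traces reflects the fact that, once both pairs of ends are sewn, it does not matter which sewing is regarded as the composition and which as the trace.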

Segal’s use of surfaces with boundary differs from that of Section 4.3.2. Usually,
quantum field theory restricts to the (easier to calculate) limiting case where the incoming
and outgoing states are at $t = \mp\infty$. This is the strategy followed in Section 4.3. Segal
is instead trying to capture the string amplitudes for finite times, because it makes the
Vir action manifest, as we’ll see shortly. The relation of Segal’s picture with that of
enhanced compact surfaces is made in pages 6-7 of [295].

The multiplication $\mathcal{C}_{0,2} \times \mathcal{C}_{0,2} \to \mathcal{C}_{0,2}$ makes the annuli space $\mathcal{C}_{0,2}$ into an
infinite-dimensional complex Lie semi-group (it has no identity and inverses). Its multiplication
is described explicitly in section 9 of [448], but to get a taste for it, forget temporarily
the parametrisations on the boundary circles: then the sewing of annuli r < |z| < 1 and
r’ < |z| < 1 obviously yields the annulus rr’ < |z| < 1, and so this annulus semi-group
is isomorphic to that of the interval (0, 1) under multiplication. Recall from Section 3.1.2
that the complex Lie algebra Witt has no Lie group, or equivalently that the real Lie
group $\mathrm{Diff}^+(S^1)$ has no complexification. The semi-group $\mathcal{C}_{0,2}$ should be regarded as
the complexification of $\mathrm{Diff}^+(S^1)$; it plays the same role for $\mathrm{Diff}^+(S^1)$ that the punctured
disc 0 < |z| < 1 plays for $S^1$. One hint of this is (4.4.1). Another (proposition 3.1 of
[502]) is that there is a one-to-one correspondence between positive energy projective
representations of $\mathrm{Diff}^+(S^1)$ (recall their definition in Section 3.1.2) and holomorphic
projective representations of $\mathcal{C}_{0,2}$. The positive energy representations of $\mathrm{Diff}^+(S^1)$ are
the only ones with a hope to extend to $\mathcal{C}_{0,2}$, and all of them are necessarily projective.
By a conjecture of Kac, these are all highest-weight modules.

In applications to string theory (namely in the presence of ‘ghosts’), the
positive-definiteness of the Hermitian product in the Hilbert space should be weakened. Also,
one may wish to supersymmetrise the state-spaces, that is, give them a $\mathbb{Z}_2$-grading (in
order to include fermions). See [502] for some comments along these lines.

Note that there is an action of $C_{0,2}$ on each $C_n$; in fact, one for each boundary circle. This semi-group action amounts to lengthening the arms of each end (equivalently, shrinking the boundary circle); physically, this corresponds to time evolution $t \to \infty$ of outgoing states, or time devolution $t \to -\infty$ of incoming states. We are used to time evolution being a unitary (hence invertible) process, but here time is imaginary, that is, space-time is Euclidean, so time evolution is a contraction. As mentioned in Section 4.2.4, Euclidean space-time is better behaved mathematically than the more physical Minkowski space-time, though in a healthy quantum field theory they should be equivalent. 

This semi-group action is the integration of the action of Witt on the moduli spaces (Section 3.1.2). By (4.4.2), this action means that each space $S(C_n)$ carries a projective $C_{0,2}$-representation. In particular, we get an action of $C_{0,2}$ on the state-space $\mathcal{H}$, projective if $c \neq 0$. This is how we recover the representation of $\mathrm{Vir} \oplus \mathrm{Vir}$ on $\mathcal{H}$ that is so important in Section 4.3.2. 

The higher-genus behaviour of an RCFT is determined from the lower-genus 
behaviour, by composition of ‘arrows’ (i.e. the sewing together of surfaces) in cate- 
gory C, as we see in Figures 4.12 and 4.16. Note that several different sewings can yield 
the same surface. That they must each give the same answer turns out to be a powerful 
constraint on CFT, called duality (Section 6.1.4). 

Thanks to sewing, a CFT is uniquely determined by the chiral algebras $\mathcal{V}$, $\overline{\mathcal{V}}$; the 1-loop partition function (which gives the spectrum of the theory, i.e. the structure of $\mathcal{H}$ as a $\mathcal{V} \otimes \overline{\mathcal{V}}$-module); and the OPE (4.3.2) (see e.g. section 4 of [502]). 

The simplest interesting example here is the ‘tree-level creation of a string from the vacuum’, i.e. $\Sigma : C_0 \to C_1$. In this case the world-sheet looks like a bowl, that is homeomorphic with a disc D, and so is associated with a linear map $S(D) : \mathbb{C} \to \mathcal{H}$. Equivalently, S(D) is the assignment of the vector S(D)(1) in $\mathcal{H}$ to D. In the case of the standard unit disc (i.e. where $D = \{z \in \mathbb{C} \mid |z| < 1\}$ and the parametrisation of the boundary $S^1$ is simply $\theta \mapsto e^{2\pi i\theta}$), this vector is called the vacuum state $|0\rangle$. In section 9 of [502] it is explained how to recover the stress-energy tensors $T(z)$, $\overline{T}(\bar{z})$, by deforming the complex structure on the disc; this idea is borrowed from CFT. 

For another important example, a surface $\Sigma : C_2 \to C_1$, that is a pair-of-pants, corresponds to a bilinear map $\mathcal{H} \otimes \mathcal{H} \to \mathcal{H}$, and makes $\mathcal{H}$ into an algebra. Choosing $\Sigma$ appropriately, this gives the OPE (4.3.2). A different choice defines the physical vertex operators (this is explicitly given on page 770 of [241]). 

Finally, suppose the initial and final objects here are both $C_0$, so the world-sheets $\Sigma$ are closed Riemann surfaces. Segal’s functor $S(\Sigma)$ is a linear map $\mathbb{C} \to \mathbb{C}$, so is completely determined by its value at $1 \in \mathbb{C}$. This value $S(\Sigma)(1) =: Z(\Sigma) \in \mathbb{C}$ is the partition function. Consider now $\Sigma$ a torus. Up to conformal equivalence, $\Sigma$ can be written as the quotient $\Sigma_\tau := \mathbb{C}/(\mathbb{Z} + \mathbb{Z}\tau)$, and so the 1-loop partition function $Z(\Sigma_\tau)$ becomes a function on the upper half-plane $\mathbb{H}$. As we know, $\Sigma_\tau$ and $\Sigma_{\alpha.\tau}$ are conformally equivalent when $\alpha \in \mathrm{SL}_2(\mathbb{Z})$, and so Z must be modular invariant. We can construct a torus by sewing together the two ends of a cylinder, or equivalently an annulus $A_q = \{z \in \mathbb{C} \mid |q| < |z| < 1\}$ for $q \in \mathbb{C}$ where the boundaries are parametrised by $q e^{2\pi i\theta}$ and $e^{2\pi i\theta}$. We know that this recovers $\Sigma_\tau$, up to conformal equivalence, if $q = e^{2\pi i\tau}$. Then $S(A_q) = q^{L_0}\bar{q}^{\bar{L}_0}$ and so by the sewing axiom (with c = 0 for convenience) the torus partition function becomes 

$$Z(\tau) = \mathrm{tr}_{\mathcal{H}}\, q^{L_0}\bar{q}^{\bar{L}_0}.$$

It must be invariant under the usual action of $\mathrm{SL}_2(\mathbb{Z})$. Of course, if the central charge is nonzero, then the sewing axiom picks up a multiplicative factor that recovers (4.3.8b). See page 768 of [241] for details. 

So far Segal is addressing general CFT. He defines an RCFT (our main interest) as a modular functor $\mathcal{B}$. It assigns to each surface $\Sigma$ its space of chiral blocks (4.3.7). Let $\Phi$ be a finite set of labels; this parametrises the irreducible modules of the chiral algebra $\mathcal{V}$. One of these labels, call it 0, is distinguished (it corresponds to the vacuum, and was called $\mathcal{V}$ in Section 4.3.2). We require that $\Phi$ has an involution $i \mapsto i^*$, called charge conjugation and related to complex conjugation. By a labelled Riemann surface with boundary $(\Sigma, \alpha)$ we mean to assign a label $\alpha_i \in \Phi$ to each (parametrised) boundary circle of $\Sigma$. These are the objects in a category $\mathrm{Riem}_\Phi$. The morphisms are ‘holomorphic collapsing maps’ (see section 5 of [502]), which sew together pairs of boundary circles in the usual way. The target is the category $\mathrm{Vect}_f$ of finite-dimensional vector spaces, since the spaces of chiral blocks live there; morphisms are linear transformations. 


Definition 4.4.1 [502] A modular functor is a functor $\mathcal{B}$ from $\mathrm{Riem}_\Phi$ to $\mathrm{Vect}_f$, such that: 

(i) $\mathcal{B}$ takes the disjoint union $\Sigma \sqcup \Sigma'$ to $\mathcal{B}(\Sigma) \otimes \mathcal{B}(\Sigma')$. 

(ii) $\mathcal{B}(\Sigma) = \mathcal{B}(-\Sigma)$, where ‘$-\Sigma$’ means that we reverse the orientation of all boundary circles of $\Sigma$ (i.e. interchange incoming with outgoing circles), and also replace each label $\alpha_i$ with its conjugate $\alpha_i^*$. 

(iii) Suppose surface $\Sigma$ is obtained from surface $\Sigma'$ by cutting along a closed curve. For each label $i \in \Phi$, let $\Sigma_i$ be the surface $\Sigma$ labelled the same as $\Sigma'$, except its two additional circles are both given the label i. Then 

$$\bigoplus_{i \in \Phi} \mathcal{B}(\Sigma_i) = \mathcal{B}(\Sigma'). \qquad (4.4.3)$$

(iv) If D is the standard disc then $\mathcal{B}(D)$ is $\mathbb{C}$ if the boundary is labelled 0, and {0} otherwise. 

(v) Finally, if $\Sigma_w$ is a family of surfaces varying holomorphically with a parameter w, then the spaces $\mathcal{B}(\Sigma_w)$ fit together to form a holomorphic vector bundle. 


We won’t spell out precisely what condition (v) means (roughly, it says that the chiral blocks are holomorphic functions on the moduli space), but certainly it implies that the dimension of $\mathcal{B}(\Sigma)$ only depends on the orientations of the boundary circles and the labels, and not on the complex structure of $\Sigma$. We discuss chiral blocks in Section 4.3.3. Their most important property is that they carry a projective representation of the mapping class group of $\Sigma$. The definition of modular functor using closed surfaces with marked points, as well as an alternate approach to $c \neq 0$, is given in chapter 5 of [32]. 

Fig. 4.17 A natural depiction of an identity $g_1 g_2 g_3 g_4 g_5 g_6 = e$. 


There are still no known examples of modular functors, though it is expected that 
any sufficiently nice vertex operator algebra will yield one. Nevertheless, this picture of 
RCFT is incomplete, as it only captures some elements of the chiral halves of an RCFT. 
For instance, the modular functor corresponding to Monstrous Moonshine is trivial. The 
1-loop partition function (4.3.8b) is important data for the RCFT, but its presence here 
is obscure (to this author at least), as more generally is the explicit relation between the 
full CFT and the two chiral halves. 


4.4.2 Groups are decorated surfaces 


This short subsection motivates topological field theory and can be skipped on first 
reading. 

Fix a group G. We can think of G as a set of identities $g_1 g_2 \cdots g_k = e$. Conjugating by $g_1$, we observe 

$$g_1 g_2 \cdots g_k = e \quad \text{iff} \quad g_2 g_3 \cdots g_k g_1 = e. \qquad (4.4.4)$$

Thus, an identity ‘$g_1 \cdots g_k = e$’ in G really should be written circularly, as in Figure 4.17. In other words, we can think of G as a way to assign to each polygon, whose sides are labelled consecutively by elements $g_i$ of G, a number $P(g_1, g_2, \ldots, g_k) \in \{0, 1\}$. We assign ‘1’ to a given labelled polygon if, starting anywhere on the circumference and reading counterclockwise, the product of the labels equals e; otherwise assign ‘0’ to it. We get a dihedral symmetry, 

$$P(g_1, g_2, \ldots, g_k) = P(g_2, \ldots, g_k, g_1), \qquad (4.4.5a)$$
$$P(g_1, g_2, \ldots, g_k) = P(g_k^{-1}, \ldots, g_2^{-1}, g_1^{-1}), \qquad (4.4.5b)$$

corresponding to the symmetries of the k-gon. 


Of course not every assignment of 0’s and 1’s to labelled polygons will come from groups. Most importantly, we have 

$$P(g_1, \ldots, g_m, h_1, \ldots, h_n) = \sum_{g \in G} P(g_1, \ldots, g_m, g)\, P(g^{-1}, h_1, \ldots, h_n). \qquad (4.4.5c)$$

This can be depicted pictorially as the dissection rule of Figure 4.18. We also get the normalisation rule 

$$\sum_{g \in G} P(g_1, \ldots, g_k, g) = 1. \qquad (4.4.5d)$$


Fig. 4.19 Associativity in a group. 

This polygonal definition is completely equivalent to the usual one of a group: 

Proposition 4.4.2 Let S be a set and let $\Pi(S)$ be the set of all polygons labelled with elements of S. Suppose $P : \Pi(S) \to \{0, 1\}$ obeys all equations (4.4.5), where for $g \in S$, ‘$g^{-1}$’ denotes the unique element of S satisfying $P(g, g^{-1}) = 1$. Define $e \in S$ by $P(e) = 1$ and the multiplication ‘gh’ by $P(g, h, (gh)^{-1}) = 1$. Then this defines a group structure on S compatible with the values P of the polygons in $\Pi(S)$. 


Thus, knowing the values of 1-gons, 2-gons and triangles fixes all other values. Asso- 
ciativity is equivalent to Figure 4.19, and all other generalised associativity relations 
can be derived from it. The entire group structure is encoded in a few polygons — the 
rest are redundant — and indeed that is how a group is usually defined. But there is an 
aesthetic appeal to considering this global (albeit highly redundant) structure provided 
by all identities in G, and this charm is lost if we focus only on the banal building blocks. 
It is reminiscent of interpreting the presentation (1.1.9) as a group of braids. 
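
The polygonal description can be checked by brute force on a small group. The following is a minimal sketch, assuming the symmetric group on three letters as the example (the choice of group and all function names are illustrative, not taken from the text): it verifies the rotation, reflection, dissection and normalisation rules of (4.4.5) and recovers the multiplication from the values of triangles, as in Proposition 4.4.2.

```python
from itertools import permutations, product

# Illustrative example group (an assumption): S3, as permutations of (0, 1, 2).
IDENTITY = (0, 1, 2)
G = list(permutations(range(3)))


def compose(p, q):
    """Return the permutation 'p after q'."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def P(*labels):
    """Value of a labelled polygon: 1 if the labels multiply to e, else 0."""
    total = IDENTITY
    for g in labels:
        total = compose(total, g)
    return 1 if total == IDENTITY else 0


def product_from_polygons(g, h):
    """Recover gh as the unique k with P(g, h, k^{-1}) = 1."""
    return next(k for k in G if P(g, h, inverse(k)) == 1)


def check_rules():
    """Brute-force check of the rules (4.4.5) on small polygons."""
    for g1, g2, g3 in product(G, repeat=3):
        # rotation of the polygon
        assert P(g1, g2, g3) == P(g2, g3, g1)
        # reflection: reverse the order and invert every label
        assert P(g1, g2, g3) == P(inverse(g3), inverse(g2), inverse(g1))
    for g1, g2, h1, h2 in product(G, repeat=4):
        # dissection: cut the polygon along a new edge labelled g
        cut = sum(P(g1, g2, g) * P(inverse(g), h1, h2) for g in G)
        assert P(g1, g2, h1, h2) == cut
    for g1, g2 in product(G, repeat=2):
        # normalisation: exactly one label closes up the polygon
        assert sum(P(g1, g2, g) for g in G) == 1
    return True
```

Calling `check_rules()` runs through every labelling of the small polygons involved, and `product_from_polygons` agrees with `compose` on all pairs, which is the content of the proposition for this example.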

Nevertheless, this rephrasing of the definition of a group is unsatisfactory for several reasons. It seems artificial that the values P are always either 0’s or 1’s. Why should we limit the right side of (4.4.4) to being e? For example, any central element will work equally well. Can we consistently sew together two sides of the same polygon, and get more interesting topologies? What does the normalisation condition really mean group-theoretically? These thoughts lead to the following construction. 

Fix a group G and irreducible character ch (Section 1.1.3). A polygon whose sides are labelled with elements $g_i$ of G is assigned the complex number 

$$P(g_1, \ldots, g_k) = \frac{\mathrm{ch}(e)}{\|G\|}\, \mathrm{ch}(g_1 \cdots g_k)$$

(recall that ch(e) is the dimension of ch). Equation (4.4.5a) continues to hold, while (4.4.5b) becomes $P(g_1, \ldots, g_k) = \overline{P(g_k^{-1}, \ldots, g_1^{-1})}$. Equation (4.4.5c) follows from the generalised orthogonality relation (theorem 2.13 in [308]) 

$$\frac{1}{\|G\|} \sum_{g \in G} \mathrm{ch}_i(gh)\, \mathrm{ch}_j(g^{-1}) = \delta_{ij}\, \frac{\mathrm{ch}_i(h)}{\mathrm{ch}_j(e)},$$

valid for irreducible $\mathrm{ch}_i$, $\mathrm{ch}_j$. The ‘normalisation condition’ (4.4.5d) should be replaced by 

$$\sum_{g \in G} |P(g_1, \ldots, g_k, g)|^2 = \frac{\mathrm{ch}(e)^2}{\|G\|} \sum_i m_i^2,$$

where $\mathrm{ch} = \sum_i m_i\, \mathrm{ch}_i$ expresses ch as a sum of irreducible characters. We see that, as before, two consecutive arcs, labelled g, h, can always be replaced by a single arc labelled gh; so a polygon can always be replaced with a disc. Moreover, the label on a disc depends only on the conjugacy class. 

More generally, we can use any character of the form $\mathrm{ch} = \sum_i \mathrm{ch}_i(e)\, \mathrm{ch}_i$, where we sum over any subset of the irreducible characters; then $P = \mathrm{ch}/\|G\|$ works. For instance, the original assignment (with values in {0, 1}) corresponds to the character ch of the regular representation of G. The normalisation condition (4.4.5d) is thus seen to be a consequence of orthogonality of characters. 
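
The character-valued assignment can be tested numerically in the same spirit. The sketch below is an illustration under stated assumptions: it again takes the symmetric group on three letters, hard-codes its well-known character table indexed by the number of fixed points, and uses exact rational arithmetic. It checks that P = ch(e) ch(g_1 ... g_k)/|G| obeys the dissection rule for each irreducible character, and that summing over all irreducible characters gives back the {0, 1}-valued assignment of the regular representation. All names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

IDENTITY = (0, 1, 2)
G = list(permutations(range(3)))  # illustrative choice of group: S3


def compose(p, q):
    """Return the permutation 'p after q'."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def fixed_points(p):
    """3 for the identity, 1 for a transposition, 0 for a 3-cycle."""
    return sum(1 for i, image in enumerate(p) if i == image)


# Character table of S3, keyed by the number of fixed points of the element.
CHARACTERS = {
    "trivial": {3: 1, 1: 1, 0: 1},
    "sign": {3: 1, 1: -1, 0: 1},
    "standard": {3: 2, 1: 0, 0: -1},
}


def ch(name, g):
    return CHARACTERS[name][fixed_points(g)]


def P(name, *labels):
    """Polygon value ch(e) * ch(g_1 ... g_k) / |G| for an irreducible character."""
    total = IDENTITY
    for g in labels:
        total = compose(total, g)
    return Fraction(ch(name, IDENTITY), len(G)) * ch(name, total)


def dissection_holds(name):
    """Check P(a, b) = sum_g P(a, g) P(g^{-1}, b) for all a, b in G."""
    return all(
        P(name, a, b) == sum(P(name, a, g) * P(name, inverse(g), b) for g in G)
        for a in G
        for b in G
    )


def regular_P(*labels):
    """Sum over all irreducible characters: the regular representation."""
    return sum(P(name, *labels) for name in CHARACTERS)
```

For a group with complex-valued characters the reflection rule would also need a complex conjugate; S3 has real characters, so that subtlety does not show up in this example.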

There is no need to stop here. The dissection rule applied to an annulus labelled with conjugacy classes $K_g$, $K_h$ (h inner, g outer) implies it is assigned ch(g)ch(h); more generally, a disc with n smaller discs removed will have value $\mathrm{ch}(g)\,\mathrm{ch}(h_1) \cdots \mathrm{ch}(h_n)$. 
In these more general settings, the orientation of the boundary circle should be made 
explicit (here they’re all taken to be counter-clockwise). A torus with a disc removed, 
and the boundary circle labelled $K_g$, has value $\frac{\|G\|}{\mathrm{ch}(e)}\,\mathrm{ch}(g)$. 

Likewise, any surface with (oriented) punctures labelled by conjugacy classes can be 
assigned a well-defined complex number. This is, in fact, a slightly enhanced topological 
field theory (Question 4.4.4). 


4.4.3 Topological field theory 


The essence of mathematics involves seeing that two different-looking things are actually 
(from the appropriate perspective) the same. What are different ways of going from point 
a to point b? In algebra these are functions, the simplest being linear; in geometry, these 
are cobordisms; in physics, this is time evolution. A topological field theory is their 
identification. 

This subsection strays a little from the main thread of this book, and so we will only 
sketch the basic idea. The following definition, that topological field theory is a monoidal 
functor from the cobordism category to $\mathrm{Vect}_f$, is due to Atiyah and was heavily influenced 
by Segal’s definition of CFT (Section 4.4.1). Topological field theory is a beautiful 
language that has elegantly formulated several deep mathematical ideas (e.g. Morse 
theory, the Jones polynomial, Donaldson invariants) — see the reviews [25], [564], [62], 
[534], [32]. The first topological field theories were constructed in physics by Schwarz 
(1978) and Witten (1982) (see [62] for references). Physically, a topological field theory 
should arise from the large-distance limit of any quantum field theory with mass gap. 


Fig. 4.20 Sewing. 

Definition 4.4.3 [25] A topological field theory in d + 1 dimensions assigns to each compact oriented smooth d-dimensional manifold $\Sigma$ a finite-dimensional complex vector space $T(\Sigma)$, and to each compact oriented (d + 1)-dimensional manifold M with boundary $\Sigma$, a vector $T(M) \in T(\Sigma)$, such that: 
(i) $T(\Sigma^*) = T(\Sigma)^*$, where $\Sigma^*$ denotes $\Sigma$ with opposite orientation, and $T(\Sigma)^*$ is the dual space. 
(ii) T takes the disjoint union $\Sigma_1 \sqcup \Sigma_2$ to $T(\Sigma_1) \otimes T(\Sigma_2)$. 
(iii) If $\partial M_i = \Sigma_i^* \sqcup \Sigma_{i+1}$ (disjoint union) and M is obtained from $M_1$ and $M_2$ by sewing along the common boundary component $\Sigma_2$, as in Figure 4.20, then $T(M) = T(M_2) \circ T(M_1)$. 
(iv) T takes the empty d-manifold $\emptyset$ to $\mathbb{C}$. 
(v) $T(\Sigma \times I)$ is the identity endomorphism of $T(\Sigma)$, where I is the unit interval. 
(vi) If $f : \Sigma \to \Sigma'$ is a homeomorphism, then there is a vector space isomorphism $T_f : T(\Sigma) \to T(\Sigma')$; if $F : M \to M'$ is a homeomorphism, then 

$$T_{F|_{\partial M}}(T(M)) = T(M').$$


Some technicalities are implicit here; see section 4.2 of [32] for any needed clarifications. The book [534] is also helpful. If the boundary of M is $\Sigma$, and we write $\Sigma$ as the disjoint union $\Sigma_1 \sqcup \Sigma_3^*$, then $T(\Sigma) = T(\Sigma_1) \otimes T(\Sigma_3)^*$ and thus the ‘vector’ T(M) can be regarded as a linear map $T(\Sigma_3) \to T(\Sigma_1)$. This functional interpretation is implied in (iii) and (v). 

M plays the role here of space-time, and $\Sigma$ that of space (i.e. a space-like time-slice of M). $T(\Sigma)$ is the space of all states at the given instant, while the map T(M) is the time-evolution operator $e^{-iHt}$. Condition (v) can be interpreted as saying that the Hamiltonian H is 0, so the only evolution is topological. 

Question 4.4.6 asks for a proof of the homotopy invariance of T. This means that the mapping class group of $\Sigma$, that is the group of components of the group $\mathrm{Diff}^+(\Sigma)$ of orientation-preserving diffeomorphisms, acts on the space $T(\Sigma)$. This is obviously important to us. 

Condition (iv) is needed to eliminate the trivial theory. If M is a closed manifold (i.e. it has no boundary), then $T(M) \in T(\emptyset) = \mathbb{C}$. Thus a topological field theory assigns a numerical invariant to closed (d + 1)-dimensional manifolds. 

Let $\Sigma$ be any d-manifold and put $M_1 = \Sigma \times I$, $M_2 = \Sigma^* \times I$. Sewing these together along corresponding copies of $\Sigma$, we get $M = \Sigma \times S^1$. From (v) we get that $T(M_i)$ are the identity maps $T(\Sigma) \to T(\Sigma)$ and $T(\Sigma)^* \to T(\Sigma)^*$, respectively. But we can also think of them as vectors in $T(\Sigma) \otimes T(\Sigma)^*$ and $T(\Sigma)^* \otimes T(\Sigma)$, so these vectors must be $\sum_i e_i \otimes e_i^*$ and $\sum_j e_j^* \otimes e_j$, respectively, where $e_i$ is any basis of $T(\Sigma)$ and $e_i^*$ is the dual basis. Thus 

$$T(\Sigma \times S^1) = \Big\langle \sum_i e_i \otimes e_i^*,\ \sum_j e_j^* \otimes e_j \Big\rangle = \dim(T(\Sigma)). \qquad (4.4.6a)$$


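Now, we know that ‘dimension’ can be twisted into ‘character’ whenever a group is present. So let $\gamma$ lie in the mapping class group and define $\Sigma \times_\gamma S^1$ to be the (d + 1)-dimensional manifold obtained by sewing $\Sigma \times I$ to $\Sigma^* \times I$ by identifying the boundary $\Sigma^* \times 0$ with $\Sigma \times 0$ and $\gamma(\Sigma^*) \times 1$ with $\Sigma \times 1$. Repeating the earlier calculation yields 

$$T(\Sigma \times_\gamma S^1) = \Big\langle \sum_i T_\gamma(e_i) \otimes e_i^*,\ \sum_j e_j^* \otimes e_j \Big\rangle = \mathrm{tr}(T_\gamma). \qquad (4.4.6b)$$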
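
The pairing in (4.4.6a) and (4.4.6b) can be unwound in coordinates. Here is a minimal sketch, assuming the state space is identified with column vectors in the standard basis and the operator attached to the mapping class is given as a square matrix (a list of rows); the function name is hypothetical. Each term of the double sum pairs the first slot of one tensor with the dual slot of the other, so only the diagonal entries survive.

```python
def twisted_circle_invariant(T):
    """Evaluate <sum_i T(e_i) (x) e_i^*, sum_j e_j^* (x) e_j> in the standard basis.

    T is a square matrix given as a list of rows. The pairing of a (x) alpha
    with beta (x) b is beta(a) * alpha(b).
    """
    n = len(T)
    total = 0
    for i in range(n):
        for j in range(n):
            beta_of_a = T[j][i]                # e_j^*(T e_i): entry in row j, column i
            alpha_of_b = 1 if i == j else 0    # e_i^*(e_j): Kronecker delta
            total += beta_of_a * alpha_of_b
    return total


identity_3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
cyclic_shift = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
```

With the identity matrix the function returns the dimension, and with a permutation matrix it returns the number of fixed basis vectors, which is the familiar passage from dimension to character.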


Theorem 4.4.4 A topological field theory in 1 + 1 dimensions is equivalent to a finite-dimensional commutative associative algebra A over $\mathbb{C}$ with unit 1, together with a linear map $\mathrm{tr} : A \to \mathbb{C}$ such that the bilinear form $(a, b) \mapsto \mathrm{tr}(ab)$ is nondegenerate. 

Nondegenerate here means that the only $a \in A$ with tr(ab) = 0 for all $b \in A$ is a = 0. Such an algebra A is called a Frobenius algebra; see, for example, chapter 2 of [353]. Frobenius algebras were introduced by Frobenius in 1903. The association of a Frobenius algebra to a (1+1)-dimensional topological field theory is straightforward. The vector space A is given by $T(S^1)$. The boundary of the disc D can be thought of as $\partial D = S^1$ or $\partial D = \emptyset \sqcup (S^1)^*$; the former interpretation defines a vector $1 := T(D) \in A$, while the latter defines the map $\mathrm{tr} := T(D) : A \to \mathbb{C}$. The product structure on A comes from T applied to a pair-of-pants, with boundary $S^1 \sqcup (S^1 \sqcup S^1)^*$. The various properties obeyed by multiplication, 1 and tr follow inductively from the various pictures; it’s a good idea for the reader to work these out. The proof that a Frobenius algebra defines a unique and well-defined topological field theory is based on the fact that any surface can be obtained by sewing together discs, cylinders and pairs-of-pants; the only difficulty is verifying well-definedness: as we know, the same surface can be decomposed this way in many different ways. The details of this proof are given in section 4.3 of [32]; see also section 3.3 of [353] for a more pedagogical treatment. This proof is practice for Section 6.1.4, where we do the same for RCFT. 
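
To see the theorem produce numbers, it helps to take the simplest kind of Frobenius algebra. The sketch below is an illustration under an extra assumption that is not in the text: A is semisimple, with a basis of orthogonal idempotents $e_i$ and $\mathrm{tr}(e_i) = \theta_i \neq 0$. Sewing a pair-of-pants to its reverse gives the 'handle element' with components $1/\theta_i$, and a closed surface of genus g is then assigned the trace of the g-th power of that element. The sample values of $\theta_i$ (squares of dimension over group order for the symmetric group on three letters) are a hypothetical choice made only so that the output is easy to recognise.

```python
from fractions import Fraction


def handle_element(thetas):
    """Components of the handle element in the idempotent basis.

    The copairing dual to (a, b) -> tr(ab) is sum_i (1/theta_i) e_i (x) e_i;
    multiplying its two tensor factors gives sum_i (1/theta_i) e_i.
    """
    return [Fraction(1) / Fraction(t) for t in thetas]


def closed_surface_invariant(thetas, genus):
    """Number assigned to a closed surface: tr(h^genus), h the handle element."""
    h = handle_element(thetas)
    return sum(Fraction(t) * hi ** genus for t, hi in zip(thetas, h))


# Hypothetical example: theta_i = (dim_i / |G|)^2 for G = S3, dims 1, 1, 2.
S3_THETAS = [Fraction(1, 36), Fraction(1, 36), Fraction(4, 36)]
```

Whatever the values of $\theta_i$, the torus is assigned the number of idempotents, that is dim A, in agreement with (4.4.6a); the sphere is assigned tr(1).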

Our symbol ‘$\Sigma$’ in Definition 4.4.3 is due to the special importance of d = 2. In 
his analysis of the Jones polynomial, Witten discovered the explicit relation between 
topological field theory in 2 + 1 dimensions and CFT (in the usual two dimensions): 
the spaces $T(\Sigma)$ are the spaces $\mathcal{B}(\Sigma)$ of chiral blocks. The relation between CFT and 
(2 + 1)-dimensional topological field theory is carefully explained in chapter 5 of [32]. In 
particular, the association of a modular functor with a topological field theory is easy, but 
(according to [32]) the association of a topological field theory with each modular functor 
is only conjectural at present. (2 + 1)-dimensional topological field theory has been used 
recently in a series of papers (see [211] for a review) for constructing (boundary) RCFT 
correlation functions. 

4.4.4 From amplitudes to algebra 


The final rigorous approach to CFT we sketch reconstructs the chiral theory directly 
from the vacuum-to-vacuum amplitudes. The physical appeal of this approach is that 
it starts with ‘observational’ data. For us, it’s excellent motivation for the material of 
the next chapter. We focus on the chiral halves of RCFT — the parts of CFT of greatest 
interest to mathematics. 

A chiral half of a CFT on a sphere consists of a state-space $\mathcal{H}$ and a collection 

$$\langle Y(\psi_1, z_1)\, Y(\psi_2, z_2) \cdots Y(\psi_n, z_n) \rangle \qquad (4.4.7)$$

of correlation functions, where $z_i$ lie in the Riemann sphere $\mathbb{P}^1(\mathbb{C}) = \mathbb{C} \cup \{\infty\}$. To avoid circularity, restrict (4.4.7) to states $\psi_i$ in some (typically finite-dimensional) subspace $\mathcal{H}_{gen}$ that generates $\mathcal{H}$. For now, all we need to know about these correlation functions (4.4.7) is that they are multi-linear in the states $\psi_i$, symmetric under permutation of $\psi_i$ and analytic in the points $z_i$, except possibly for poles when $z_i = z_j$. At this point the notation in (4.4.7) is purely formal, so for example ‘$Y(\psi_i, z_i)$’ has no meaning. Our first task is to associate with the states $\psi \in \mathcal{H}_{gen}$, vertex operators $Y(\psi, z)$. 

Let O be any open set in $\mathbb{P}^1(\mathbb{C})$, with the property that its complement is path-connected and contains a disc. A counterintuitive result of axiomatic quantum field theory (the Reeh-Schlieder Theorem [518], [269]) says that the states $\varphi_1(f_1) \cdots \varphi_n(f_n)|0\rangle$ generated from the vacuum $|0\rangle$ by fields $\varphi_i$ smeared by test-functions $f_i$ localised to O, will be dense in $\mathcal{H}$. This observation motivates the following construction. 

Define the space $V_O$ formally spanned by all words $Y(\psi_1, z_1) \cdots Y(\psi_n, z_n)|0\rangle$, where $\psi_i \in \mathcal{H}_{gen}$ and $z_i \in O$, $z_i$ pairwise distinct, and we require any word to be multilinear and symmetric in the $\psi_i$. We want to complete these infinite-dimensional spaces (i.e. include the limits of certain sequences), topologise them (i.e. decide when vectors are ‘close’) and identify vectors that are physically indistinguishable (i.e. quotient by null-vectors). We can do all three, using the amplitudes (4.4.7) to define a bilinear pairing $V_O \times V_{O'} \to \mathbb{C}$, for any open set O' in the complement of O: 

$$\Big\langle \sum_i Y(\psi_1^{(i)}, z_1^{(i)}) \cdots Y(\psi_{m_i}^{(i)}, z_{m_i}^{(i)})|0\rangle,\ \sum_j Y(\phi_1^{(j)}, w_1^{(j)}) \cdots Y(\phi_{n_j}^{(j)}, w_{n_j}^{(j)})|0\rangle \Big\rangle = \sum_{i,j} \Big\langle Y(\psi_1^{(i)}, z_1^{(i)}) \cdots Y(\phi_{n_j}^{(j)}, w_{n_j}^{(j)}) \Big\rangle \qquad (4.4.8)$$

for all $\psi, \phi \in \mathcal{H}_{gen}$, $z^{(i)} \in O$, $w^{(j)} \in O'$. A topology on say $V_O$ is obtained by defining this pairing to be continuous. We identify vectors in $V_O$ by quotienting by those vectors in $V_O$ that are orthogonal to all of $V_{O'}$. This pairing (4.4.8) also allows us to complete the space $V_O$. The resulting space turns out to be independent of O'; call it $V^O$. See [227] for details. 

If $O_1 \subset O_2$, then we get a natural continuous embedding of $V^{O_2}$ into $V^{O_1}$. The role here of the space $\mathcal{H}$ of states is played by this collection $V^O$ of topological vector spaces, just as the role of the algebra A of observables in quantum field theory is played in algebraic quantum field theory by the net A(O) (Section 4.2.4). However, if $O \subset \mathbb{P}^1(\mathbb{C})$ contains $\infty$ but not 0, then we can define the modes $\psi_{(n)}$, for $\psi \in \mathcal{H}_{gen}$, in the usual way and from this get a Fock space $\mathcal{H}^O \subset V^O$ spanned by all $(\psi_1)_{(n_1)} \cdots (\psi_k)_{(n_k)}|0\rangle$. It is easy to see that it is dense in $V^O$ and independent of the choice of O. This Fock space will be the VOA (Definition 5.1.3) of the CFT. 

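It is now easy to define the vertex operators $Y(\psi, z)$. Choose any $\psi \in \mathcal{H}_{gen}$ and $z \in O$, and any subset $O' \subset O$ with $z \notin O'$. Then the operator $Y(\psi, z) : V^{O'} \to V^{O}$ is defined by 

$$\sum_i Y(\psi_1^{(i)}, z_1^{(i)}) \cdots Y(\psi_{m_i}^{(i)}, z_{m_i}^{(i)})|0\rangle \ \mapsto\ \sum_i Y(\psi, z)\, Y(\psi_1^{(i)}, z_1^{(i)}) \cdots Y(\psi_{m_i}^{(i)}, z_{m_i}^{(i)})|0\rangle$$

(there is a little work to see that this operator lifts from $V_{O'}$ to $V^{O'}$; again see [227]). Note that we automatically obtain commutativity: the identity 

$$Y(\psi, z)\, Y(\phi, w) = Y(\phi, w)\, Y(\psi, z)$$

holds in $V^O$ provided $z, w \in O$, $z \neq w$, $\psi, \phi \in \mathcal{H}_{gen}$ (compare va4 in Definition 5.1.3). 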

So far we have assumed only the most basic properties of the amplitudes (4.4.7). The 
full splendour of CFT begins to reveal itself once we impose Möbius invariance, which 
says that it shouldn’t matter how we identify the sphere $\mathbb{P}^1(\mathbb{C})$ with the complex coordinates 
$\mathbb{C} \cup \{\infty\}$. This invariance implies the usual Möbius covariance of the amplitudes 
and vertex operators. It allows us to extend the definition of vertex operators to, for 
example, V, and to establish Jacobi’s identity (5.1.7a). Although this is where things 
start getting interesting, this is where we leave off. 

We know the state-space of the CFT is a module for the chiral algebra. This is recovered in this formalism through the two-point functions, which are of the form 

$$\langle Y''(\phi_2, w_2)\, Y(\psi_1, z_1) \cdots Y(\psi_m, z_m)\, Y'(\phi_1, w_1) \rangle, \qquad (4.4.9)$$

where the states $\psi_i$ lie in $\mathcal{H}_{gen}$ as before, and $\phi_i$ lie in spaces $W_i$ (which we can take to be dual to each other, although this isn’t necessary). We can construct spaces $W^O$ much as before, generated by 

$$Y(\psi_1, z_1) \cdots Y(\psi_m, z_m)\, Y'(\phi_1, w_1)|0\rangle,$$

and interpret the symbol $Y'(\phi_1, w_1)$ as a vertex operator sending $W^{O} \to W^{O'}$, much as before. This leads quite naturally to the notion of a VOA-module (Definition 5.3.1). 

An observation that will be helpful in Section 5.3.2 in motivating Zhu’s algebra is that each representation corresponds to a linear functional on the chiral algebra: 

Proposition 4.4.5 [429], [227] The amplitudes (4.4.9) define a representation of the chiral algebra $\mathcal{V}$, provided that for each open O with path-connected complement, and all states $\phi_i \in W_i$ and points $w_i \notin O$, there is a state $v = v(\phi_1, \phi_2, w_1, w_2) \in V^O$ satisfying 

$$\langle Y''(\phi_2, w_2)\, Y(\psi_1, z_1) \cdots Y(\psi_m, z_m)\, Y'(\phi_1, w_1) \rangle = \langle Y(\psi_1, z_1) \cdots Y(\psi_m, z_m)\, v \rangle$$

for all choices of $z_i \in O$, $\psi_i \in \mathcal{H}_{gen}$. 




The proof of the proposition isn’t difficult (see theorem 6 in [227]). This proposition 
permits us to characterise the representations of a chiral algebra by states v. It turns out 
that these v, which can be interpreted as linear functionals on the Fock space H?’ using 
the pairing (4.4.8), vanish on a certain large subspace aan of H®’, and so define linear 


functionals on the quotient H? /O2 ase In the case of a rational CFT, this quotient space 
will be finite-dimensional and is called Zhu’s algebra (Section 5.3.2). 


Question 4.4.1. What is the value $\mathcal{B}(S^2)$ that Segal’s functor associates with the sphere? 


Question 4.4.2. Suppose labelled surfaces $\Sigma$ and $\Sigma'$ are sewed end-to-end (so the corresponding labels match, and the corresponding circle orientations are opposite), to produce a new labelled surface $\Sigma''$. Construct a canonical map $\mathcal{B}(\Sigma) \otimes \mathcal{B}(\Sigma') \to \mathcal{B}(\Sigma'')$. If $\mathcal{B}(\Sigma)$, $\mathcal{B}(\Sigma')$, $\mathcal{B}(\Sigma'')$ are all nonzero, can that map be identically 0? 


Question 4.4.3. (a) Let A be any annulus with oppositely oriented boundary circles. Prove $\mathcal{B}(A) = \{0\}$, unless both circles are given the same label $i \in \Phi$, in which case $\mathcal{B}(A) = \mathbb{C}$. 

(b) If T is any torus, prove that $\mathcal{B}(T)$ has dimension equal to the cardinality of $\Phi$. 


Question 4.4.4. Find a relation between the assignments P* to surfaces with punctures 
labelled with conjugacy classes of G, and two-dimensional topological field theory. 


Question 4.4.5. (a) If M is the disjoint union of $M_1$ and $M_2$, what is T(M) in terms of $T(M_i)$? 
(b) What does T send the empty (d + 1)-manifold $\emptyset$ to? 


Question 4.4.6. Prove that if $f : \Sigma \to \Sigma'$ is a homeomorphism homotopic to the identity, then the linear map $T_f$ of (vi) is the identity. 


Question 4.4.7. Classify all topological field theories of dimension d = 0. 


5 


Vertex operator algebras 


Vertex operator algebras (VOAs) are a mathematically precise formulation of the notion 
of chiral algebra (Section 4.3.2), the symmetry algebra of conformal field theory. They 
constitute the simplest expression we have of the machine that associates the Monster 
M with the Hauptmoduls. VOAs were first defined by Borcherds, and their theory has 
since been developed by a number of people. We begin with the rather complicated 
definition, before turning to our greatest interest: their representation theory. The final 
section sketches some relations of vertex algebras to geometry. See, for example, [201], 
[330], [197], [376] for more complete treatments; a more physically minded introduction 
is provided in [242]. 

Vertex operator algebras are not a type of operator algebra; rather, they are an algebra 
of vertex operators. Vertex operators arose first in string theory back in the early 1970s 
as a device for computing string amplitudes. They appeared independently in the mathe- 
matical literature (starting with [377]) in order to realise affine Kac—Moody algebras and 
their modules as algebras of differential operators. Today, just as we define a ‘vector’ 
to be an element of a vector space, we define a ‘vertex operator’ to be a formal power 
series Y (u, z) appearing in a vertex algebra. 


5.1 The definition and motivation 
5.1.1 Vertex operators 
In bosonic string theory, the vertex operator (Section 4.3.1) corresponding to the absorption of a tachyon with momentum $k = (k^\mu)$ at world-sheet position z and space-time position $X(z) = (X^\mu(z))$ is the normal-ordered expression $V(k, z) = {:}e^{ik \cdot X(z)}{:}$. Write 

$$X^\mu(z) = x^\mu - ip^\mu \log(z) + i \sum_{n \neq 0} \frac{1}{n}\, \alpha_n^\mu z^{-n},$$

where $x^\mu$ and $p^\mu$ are classically the position and momentum of the string’s centre-of-mass, and $\alpha_n^\mu$ its oscillation coordinates. Then the vertex operator is (chapter 2.2 of [261]) 

$$V(k, z) = \exp\Big(k \cdot \sum_{n \geq 1} \frac{\alpha_{-n}}{n} z^{n}\Big)\, z^{k \cdot p - 1} e^{ik \cdot x}\, \exp\Big(-k \cdot \sum_{n \geq 1} \frac{\alpha_{n}}{n} z^{-n}\Big). \qquad (5.1.1a)$$


Independently, Lepowsky and Wilson realised the affine algebra $A_1^{(1)}$ using differential operators (they tried to do this because finite-dimensional Lie algebras often act as differential operators, for example, on the space of functions on an associated Lie group): 


Theorem 5.1.1 [377] A basis for the affine algebra $A_1^{(1)}$ consists of the operators 

$$1, \quad y_n, \quad \frac{\partial}{\partial y_n}, \quad Y_k, \qquad n \in \Big\{\tfrac{1}{2}, \tfrac{3}{2}, \tfrac{5}{2}, \ldots\Big\}, \quad k \in \tfrac{1}{2}\mathbb{Z},$$

thought of as operators on the space $\mathbb{C}[y_{1/2}, y_{3/2}, y_{5/2}, \ldots]$ of polynomials in the $y_n$ (we are ignoring the derivation in $A_1^{(1)}$). The differential operators $Y_k$ are the homogeneous 

components of the formal generating function 
components of the formal generating function 


Y(z)= > Y, z* = exp (x i.) exp (-» i) . (5.1.1b) 


ke5Z 


In particular, (ignoring the derivation $\ell_0$) $A_1^{(1)}$ is spanned by a central term C, as well as $e \otimes t^m$, $f \otimes t^m$, $h \otimes t^m$ for each $m \in \mathbb{Z}$ (Section 3.2.2). In Theorem 5.1.1, 1 corresponds to C. For each $n \in \mathbb{N} + 1/2$, the operators $y_n$ and $\partial/\partial y_n$ correspond (up to numerical proportionality factors) respectively to $e \otimes t^{\mp n - 1/2} + f \otimes t^{\mp n + 1/2}$, and $Y_{\pm n}$ corresponds to $-e \otimes t^{\pm n - 1/2} + f \otimes t^{\pm n + 1/2}$. For $k \in \mathbb{Z}$, the operator $Y_k$ corresponds to $h \otimes t^k$ (for $k \neq 0$) and $h \otimes 1 - C/2$ for k = 0. 

It was Garland who first recognised the formal resemblance between these transcendental 
expressions (5.1.1a) and (5.1.1b). Note that when expanded out they both 
involve a sum over powers of z, unbounded in both the positive and negative directions. 
Doubly-infinite series scream of convergence difficulties. The fractional indices n, k in 
Theorem 5.1.1 are a signature of what we today call twisted vertex operators. 

The geometric meaning of the vertex operator is perhaps best explained in the context 
of the loop group (Section 3.2.6). Suppose the loop group $LS^1$ acts on some space $\mathcal{H}$. 
For each $0 < s < 2\pi$ and $\epsilon > 0$, consider the loop $\gamma_s^\epsilon \in LS^1$ defined by 


leS! for |s —t| > 
exp (zi5t) eS! forse <t<ste’ 


v 6) = | 


for all $0 < t < 2\pi$. In words, $\gamma_s^\epsilon$ stays at the identity $1 \in S^1$ for all time t, except for 
a small interval around $t \approx s$ where the loop rapidly winds around $S^1$ once. This loop 
corresponds to some operator on $\mathcal{H}$; the limit (appropriately taken) as $\epsilon \to 0$ is an 
operator-valued distribution on H called a vertex operator (see chapter 13 of [465] for 
details). 


5.1.2 Formal power series 


As we saw last chapter, the basic object of quantum field theory is the quantum field. 
It is tempting to think of it as a choice of operator A(x) at each space-time point x, but 
‘function’ (or ‘section of a vector bundle’ for that matter) is too narrow a concept even 
in free theories. 

The analytic way to make sense of ‘functions’ like quantum fields is through distributions, 
and this was the approach taken in Section 4.2.4. We will describe now 
the algebraic alternative. These two approaches are not equivalent: you can do some 
things in one approach that you can’t do in the other, at least not without difficulty 
(Section 5.4.1). But as always the algebraic approach is considerably simpler techni- 
cally — there are no convergence concerns to address — and it is remarkable how much 
can still be captured. It was first created around 1980 by Garland and Date—Kashiwara— 
Miwa to make sense of doubly-infinite series like (5.1.1), and is now the language of 
VOAs. Good introductions to the material in this subsection are [201], [330], [376]. 

Keep in mind that in CFT we are trying to capture operator-valued ‘functions’ on
two-dimensional Euclidean space-time (Section 4.3.2). Locally space-time looks like
$\mathbb{C}$; as explained in Section 4.3, we like to compactify the external legs. For example,
for an incoming string tracing a cylindrical world-sheet, the space-time point
$(x, t)$ is associated with the complex number $z = e^{t + ix}$, so time $t = -\infty$ corresponds
to $z = 0$.

Let $W$ be any vector space. Define $W[[z^{\pm 1}]]$ to be the set of all formal series
$\sum_{n=-\infty}^{\infty} w_n z^n$, where the coefficients $w_n$ lie in our space $W$. We don’t ask here whether a
given series converges or diverges; $z$ is merely a formal place-keeping variable. We will
also be interested in the space $W[z^{\pm 1}]$ of Laurent polynomials, that is, expressions of
the form $\sum_{n=-M}^{N} w_n z^n$. $W[[z^{\pm 1}]]$ itself forms a vector space, using the obvious addition
and scalar multiplication.

Our aim here is to describe quantum fields, so we want our formal series to be operator-valued.
To do this, choose $W$ to be a vector space of operators (matrices if you prefer):
$W = \mathrm{End}(V)$, for some space $V$. We are actually interested in $V$ being the
infinite-dimensional state-space of the theory, but in the following examples we take $V = \mathbb{C}$,
that is, power series with numerical coefficients.

We can now multiply our formal series in the obvious way. For example, consider
$V = \mathbb{C}$, and take $c(z) = z^{21} - 5z^{-1}$ and $d(z) = \sum_{n=-\infty}^{\infty} z^n$. Then

\[
c(z)\,d(z) = \sum_{n \in \mathbb{Z}} z^{n+21} - 5 \sum_{n \in \mathbb{Z}} z^{n-1} = \sum_{n \in \mathbb{Z}} z^n - 5 \sum_{n \in \mathbb{Z}} z^n = -4\,d(z).
\]


This simple calculation tells us many things. 


(i) We can’t always divide: $c(z)d(z) = -4d(z)$ shows that the cancellation law fails
and that $\mathbb{C}[[z^{\pm 1}]]$ isn’t even an integral domain.

(ii) Try to compute the square $d(z)^2$: we get infinity. That is, you can’t always
multiply in $W[[z^{\pm 1}]]$.

(iii) Working out a few more multiplications of this kind, we find that $f(z)\,d(z) = f(1)\,d(z)$
for any $f$ for which $f(1)$ exists (e.g. any Laurent polynomial $f \in W[z^{\pm 1}]$).
Thus $d(z)$ is what we have called the Dirac delta $\delta(z - 1)$ centred at $z = 1$. (You
can think of it as the Fourier expansion of the Dirac delta, followed by a change of
variables.) So of course it makes perfect sense that we couldn’t work out $d(z)^2$:
we were trying to square the Dirac delta, which we know is impossible!
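Observation (iii) can be checked on a finite window of coefficients. The sketch below is only an illustration: the representation (a dictionary for the Laurent polynomial, a coefficient function for the doubly-infinite series) and all function names are assumptions made for this example, and the polynomial is an arbitrary choice with value $-4$ at $z = 1$.

```python
def delta_coeff(n):
    """Coefficient of z**n in d(z) = sum of z**n over all integers n."""
    return 1

def mul_laurent_by_series(poly, series_coeff):
    """Multiply a Laurent polynomial by a doubly-infinite formal series.

    poly maps exponent -> coefficient (finitely many entries);
    series_coeff(n) returns the coefficient of z**n in the series.
    The product is returned as another coefficient function.
    """
    def coeff(n):
        # z**n arises from z**k (in poly) times z**(n - k) (in the series).
        # The sum is finite, so no convergence question arises.
        return sum(c * series_coeff(n - k) for k, c in poly.items())
    return coeff

def evaluate_at_one(poly):
    """f(1) for a Laurent polynomial f."""
    return sum(poly.values())

c = {21: 1, -1: -5}                 # illustrative polynomial with c(1) = -4
cd = mul_laurent_by_series(c, delta_coeff)

# On any window, every coefficient of c(z) d(z) equals c(1) times that of d(z).
window = range(-50, 51)
matches = all(cd(n) == evaluate_at_one(c) * delta_coeff(n) for n in window)
print(matches)
```

Trying the same routine with two doubly-infinite series would require an infinite sum for each coefficient, which is exactly the obstruction met in (ii).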
There is a certain divergence of notations here: should $\delta$ be written additively (i.e.
$\delta(z - 1)$), in the familiar way, or should it be written multiplicatively (i.e. $\delta(z)$), in the
more honest way? Throughout this chapter we use the multiplicative notation. So we get

\[
\delta(z) = \sum_{n=-\infty}^{\infty} z^n. \tag{5.1.2}
\]

In fact, the best notation of all would be the awkward $\delta(z)\,dz$, since the Dirac delta
centred at $z = a$ is $\sum_n z^n a^{-n-1} = a^{-1}\delta(z/a)$.

Making contact with Section 1.3.1, the Laurent polynomials $(\mathrm{End}\,V)[z^{\pm 1}]$ play the
role here of the smooth functions $C^\infty$ with compact support, and the formal power
series $(\mathrm{End}\,V)[[z^{\pm 1}]]$ play the role of its dual. So these power series $f \in (\mathrm{End}\,V)[[z^{\pm 1}]]$
are formal distributions; this is why $f(z)$ usually diverges. The evaluation $f(p)$ of a
distribution $f \in (\mathrm{End}\,V)[[z^{\pm 1}]]$ on the test function $p \in (\mathrm{End}\,V)[z^{\pm 1}]$ is given by the
‘formal residue’ $\mathrm{Res}_z(f(z)\,p(z)) \in \mathrm{End}\,V$, where

\[
\mathrm{Res}_z\Big(\sum_{n \in \mathbb{Z}} b_n z^n\Big) = b_{-1}. \tag{5.1.3a}
\]

The idea is that, up to a factor of $(2\pi i)^{-1}$, this would equal the contour integral of
$g(z) = \sum b_n z^n$ around a small circle about $z = 0$, at least for meromorphic $g$. Hence
$\mathrm{Res}_z$ obeys many of the familiar properties of integrals, such as integration by parts:

\[
\mathrm{Res}_z(g\,\partial_z f) = -\mathrm{Res}_z(f\,\partial_z g), \tag{5.1.3b}
\]

where $\partial_z f$ is the formal (term-by-term) derivative of $f(z)$. For example, the formal
distribution $a^{-k-1}\delta^{(k)}(z/a)$ takes the test function $f(z)$ to the value $(-1)^k(\partial^k f)(a)$.
Because of the usefulness of the notion of residue, we write

\[
f(z) = \sum_{n \in \mathbb{Z}} a_n z^n = \sum_{m \in \mathbb{Z}} a_{(m)} z^{-m-1}, \tag{5.1.3c}
\]

where $a_{(m)} = \mathrm{Res}_z(z^m f(z)) = a_{-m-1}$ is called a mode.
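The formal residue, integration by parts (5.1.3b) and the mode convention (5.1.3c) are easy to experiment with on Laurent polynomials. The following is a minimal sketch; the dictionary representation `{exponent: coefficient}`, the function names and the two sample polynomials are assumptions made for this illustration only.

```python
def res(f):
    """Formal residue (5.1.3a): the coefficient of z**(-1)."""
    return f.get(-1, 0)

def mul(f, g):
    """Product of two Laurent polynomials given as {exponent: coefficient}."""
    out = {}
    for m, a in f.items():
        for n, b in g.items():
            out[m + n] = out.get(m + n, 0) + a * b
    return out

def deriv(f):
    """Formal term-by-term derivative."""
    return {n - 1: n * a for n, a in f.items() if n != 0}

def mode(f, m):
    """a_(m) = Res_z(z**m f(z)); by (5.1.3c) this is the coefficient a_{-m-1}."""
    return res(mul({m: 1}, f))

# Two arbitrary sample Laurent polynomials.
f = {-3: 2, -1: 7, 0: 1, 4: -5}
g = {-2: 3, 1: 4, 2: -1}

lhs = res(mul(g, deriv(f)))     # Res_z(g d_z f)
rhs = -res(mul(f, deriv(g)))    # -Res_z(f d_z g)
print(lhs == rhs)
print(all(mode(f, m) == f.get(-m - 1, 0) for m in range(-6, 6)))
```

Integration by parts holds here for a purely algebraic reason: the derivative of any Laurent polynomial has no $z^{-1}$ term, so $\mathrm{Res}_z(\partial_z(fg)) = 0$.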
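Similar remarks hold for several variables $z_i$. The distributions are the formal series

\[
f(z_1, \ldots, z_k) = \sum_{n_i \in \mathbb{Z}} a_{n_1, \ldots, n_k}\, z_1^{n_1} \cdots z_k^{n_k} = \sum_{m_i \in \mathbb{Z}} a_{(m_1, \ldots, m_k)}\, z_1^{-m_1 - 1} \cdots z_k^{-m_k - 1}
\]

in $W[[z_1^{\pm 1}, \ldots, z_k^{\pm 1}]]$, and the test functions $f(z_1, \ldots, z_k) \in W[z_1^{\pm 1}, \ldots, z_k^{\pm 1}]$ consist
of those power series with only finitely many nonzero terms. The Dirac delta centred at
$z_1 = z_2$ is given by $z_2^{-1}\delta(z_1/z_2) = z_1^{-1}\delta(z_2/z_1)$.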

But we must not get overconfident: 


Paradox 5.1 Consider the following product:

\[
\delta(z) = \Big[\Big(\sum_{n \ge 0} z^n\Big)(1 - z)\Big]\,\delta(z) = \Big(\sum_{n \ge 0} z^n\Big)\big[(1 - z)\,\delta(z)\big] = \Big(\sum_{n \ge 0} z^n\Big)\big[0\,\delta(z)\big] = 0.
\]

When physicists are confronted with ‘paradoxes’ such as this, they respond by treading
with care when they are involved in a calculation reminiscent of the paradoxes,
and otherwise trusting their instincts. Mathematicians typically over-react: after kicking
themselves for walking head first into a ‘paradox’, they devise a rule absolutely
guaranteeing that the paradox will always be safely avoided in the future. We will follow
the mathematicians’ approach, and in the next few paragraphs describe how to avoid
Paradox 5.1 by forbidding certain innocent-looking products.

Recall that we are actually interested in the space $W = \mathrm{End}(V)$. We call infinitely
many linear maps $w^{(i)} \in \mathrm{End}(V)$ algebraically summable if for every vector $v \in V$, only
finitely many values $w^{(i)}v \in V$ are different from 0. In other words, fixing a basis for
$V$, only finitely many of the matrices $w^{(i)}$ have a nonzero first column, only finitely
many have a nonzero second column, etc. The usual notation ‘$\sum_i w^{(i)}$’ will denote the
well-defined endomorphism sending each $v \in V$ to that effectively finite sum $\sum_i w^{(i)}v$.

Consider a family (possibly infinite) of formal series $w^{(i)}(z) \in W[[z^{\pm 1}]]$. We certainly
have a well-defined sum $\sum_i w^{(i)}(z)$ if for each fixed $n$, the set $\{w^{(i)}_n\}$ (as $i$ varies) of maps
is algebraically summable. We shall call such a sum algebraically defined, and write

\[
\sum_i w^{(i)}(z) = \sum_{n \in \mathbb{Z}} \Big(\sum_i w^{(i)}_n\Big) z^n.
\]

All other sums are forbidden. Likewise, we certainly have a well-defined product
$\prod_{i=1}^{m} w^{(i)}(z)$ of finitely many formal power series if for each $n$, the set

\[
\big\{ w^{(1)}_{n_1} w^{(2)}_{n_2} \cdots w^{(m)}_{n_m} \big\}
\]

(vary the $n_i$ subject to the constraint $\sum_i n_i = n$) is algebraically summable. Again, call
such a product algebraically defined and set it equal to

\[
\prod_{i=1}^{m} w^{(i)}(z) = \sum_{n \in \mathbb{Z}} \Big(\sum_{n_1 + \cdots + n_m = n} w^{(1)}_{n_1} \cdots w^{(m)}_{n_m}\Big) z^n,
\]

where the second sum is over all $m$-tuples $(n_i)$ obeying $\sum_i n_i = n$. All other products
(e.g. all infinite ones) are forbidden. An algebraically defined product is necessarily
associative.
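To make algebraic summability concrete, take $V$ with basis $e_0, e_1, e_2, \ldots$ and compare the powers of a lowering operator with the powers of a raising operator. This is a sketch under stated assumptions: the two operator families, the dictionary representation of vectors and the cutoff parameter are all choices made for this illustration, not constructions from the text.

```python
def lower(i, v):
    """Apply L**i to v, where L e_k = e_{k-1} for k >= 1 and L e_0 = 0."""
    return {k - i: c for k, c in v.items() if k >= i}

def raise_(i, v):
    """Apply R**i to v, where R e_k = e_{k+1}."""
    return {k + i: c for k, c in v.items()}

def partial_sum(op, v, cutoff):
    """Sum op(i, v) over 0 <= i <= cutoff.

    Returns the total and the number of nonzero summands, so that one can
    see whether the sum stabilises as the cutoff grows.
    """
    total, nonzero = {}, 0
    for i in range(cutoff + 1):
        term = op(i, v)
        if term:
            nonzero += 1
        for k, c in term.items():
            total[k] = total.get(k, 0) + c
    return total, nonzero

e3 = {3: 1}

# The family {L**i} is algebraically summable: on e_3 only i = 0, 1, 2, 3 contribute,
# so the result does not depend on the cutoff once it exceeds 3.
print(partial_sum(lower, e3, 10) == partial_sum(lower, e3, 1000))

# The family {R**i} is not: every i contributes, so the sum never stabilises.
print(partial_sum(raise_, e3, 10)[1], partial_sum(raise_, e3, 20)[1])
```

In matrix terms, each column of $\sum_i L^i$ receives only finitely many contributions, which is exactly the column condition stated above.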

There are certainly more general ways to have a well-defined product or sum. For
example, according to our rule, the series $\sum_n 2^{-n}$ would be forbidden. In this way we
avoid the more complicated realm of convergence issues. In short, we are doing algebra
here, and don’t want to be distracted by the dust clouds kicked up by mere analytic
concerns. Such restrictions are common in infinite-dimensional algebra (recall footnote
14 in chapter 1). The product of a distribution $f \in W[[z_1^{\pm 1}, \ldots, z_k^{\pm 1}]]$ with a test function
$p \in W[z_1^{\pm 1}, \ldots, z_k^{\pm 1}]$ is always defined, and will be a distribution. The explanation of
Paradox 5.1 is that although $(\sum z^n)(1 - z)$ exists and equals 1, and $(1 - z)\,\delta(z)$ exists
and equals 0, the triple product $(\sum z^n)(1 - z)\,\delta(z)$ is forbidden.

A consequence of our algebraic approach is that the product $z^{1/2}\delta(z)$ does not equal
$1^{1/2}\delta(z) = \delta(z)$: their formal power series are very different. In hindsight this ‘failing’ is
understandable: it is artificial here to prefer the positive root of 1 over the negative root.
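Proposition 5.1.2 Let $W$ be any vector space, and $f \in W[[z_1^{\pm 1}, z_2^{\pm 1}]]$. Then
$(z_1 - z_2)^N f(z_1, z_2) = 0$ for some integer $N \ge 1$, iff

\[
f(z_1, z_2) = \sum_{j=0}^{N-1} \frac{1}{j!}\, c^j(z_2)\, \partial_{z_2}^j \big(z_2^{-1}\delta(z_1/z_2)\big),
\]

where $c^j(z_2) = \mathrm{Res}_{z_1}\big((z_1 - z_2)^j f(z_1, z_2)\big) \in W[[z_2^{\pm 1}]]$.

Proof: Write $f(z_1, z_2) = \sum_{m,n} a_{m,n} z_1^m z_2^n$. First, $(z_1 - z_2) f(z_1, z_2) = 0$ iff
$a_{m-1,n} = a_{m,n-1}$ $\forall m, n$, iff $a_{m,n} = a_{0,m+n}$ $\forall m, n$, iff

\[
f(z_1, z_2) = \Big(\sum_{n \in \mathbb{Z}} a_{0,n} z_2^n\Big)\,\delta(z_1/z_2).
\]

Also, for any $j \ge 1$,

\[
(z_1 - z_2)\,\partial_{z_2}^j \big(z_2^{-1}\delta(z_1/z_2)\big) = (z_1 - z_2) \sum_{n \in \mathbb{Z}} n(n-1)\cdots(n-j+1)\, z_1^{-n-1} z_2^{n-j} = j\,\partial_{z_2}^{j-1} \big(z_2^{-1}\delta(z_1/z_2)\big).
\]

Hence

\[
(z_1 - z_2) f(z_1, z_2) = \sum_{j=0}^{M} b^j(z_2)\, \partial_{z_2}^j \big(z_2^{-1}\delta(z_1/z_2)\big)
\]

has general solution

\[
f(z_1, z_2) = \sum_{j=0}^{M} \frac{1}{j+1}\, b^j(z_2)\, \partial_{z_2}^{j+1} \big(z_2^{-1}\delta(z_1/z_2)\big),
\]

up to adding a solution $b(z_2)\,\delta(z_1/z_2)$ of the homogeneous equation. ∎

For reasons given next subsection, we call any formal distributions $a(z)$, $b(z)$ mutually
local if $f(z_1, z_2) := [a(z_1), b(z_2)]$ satisfies the condition in Proposition 5.1.2. In a vertex
algebra or VOA (Definition 5.1.3), all fields are mutually local.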
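The key step in the proof, that multiplying the $j$-th derivative of the delta function by $(z_1 - z_2)$ lowers the order of the derivative, can be tested coefficient by coefficient. The sketch below assumes the expansion $z_2^{-1}\delta(z_1/z_2) = \sum_k z_1^{-k-1} z_2^{k}$ and represents a two-variable formal series by a coefficient function; the names `falling`, `D` and `times_z1_minus_z2` are hypothetical helpers introduced only for this check.

```python
def falling(k, j):
    """Falling factorial k (k-1) ... (k-j+1); the empty product is 1."""
    out = 1
    for i in range(j):
        out *= k - i
    return out

def D(j):
    """Coefficient function of the j-th z2-derivative of z2**(-1) delta(z1/z2).

    Since z2**(-1) delta(z1/z2) = sum_k z1**(-k-1) z2**k, its j-th z2-derivative
    is sum_k falling(k, j) z1**(-k-1) z2**(k-j).
    """
    def coeff(m, n):
        k = -m - 1                      # exponent of z1 is m = -k-1
        return falling(k, j) if n == k - j else 0
    return coeff

def times_z1_minus_z2(F):
    """Coefficient function of (z1 - z2) F(z1, z2)."""
    return lambda m, n: F(m - 1, n) - F(m, n - 1)

window = [(m, n) for m in range(-8, 9) for n in range(-8, 9)]

# (z1 - z2) D_0 = 0, and (z1 - z2) D_j = j D_{j-1} for j >= 1.
kills_delta = all(times_z1_minus_z2(D(0))(m, n) == 0 for m, n in window)
lowers_order = all(
    times_z1_minus_z2(D(j))(m, n) == j * D(j - 1)(m, n)
    for j in range(1, 5) for m, n in window
)
print(kills_delta, lowers_order)
```

Iterating the second identity shows that $(z_1 - z_2)^{j+1}$ annihilates the $j$-th derivative, which is the ‘if’ direction of the proposition.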
We need ways to make new formal power series from old ones. First, for any $n \in \mathbb{Z}$,
we define the binomial formula to hold:

\[
(z_1 + z_2)^n = \sum_{k \in \mathbb{N}} \binom{n}{k} z_1^{n-k} z_2^{k}, \tag{5.1.4a}
\]

where we define $\binom{n}{k} = n(n-1)\cdots(n-k+1)/k!$ for any $n$. Equation (5.1.4a) lets us
define, for any formal power series $f(z) = \sum_n a_n z^n \in W[[z^{\pm 1}]]$,

\[
f(z_1 + z_2) := \sum_{n \in \mathbb{Z}} \sum_{k \ge 0} a_n \binom{n}{k} z_1^{n-k} z_2^{k} \in W[[z_1^{\pm 1}, z_2]]. \tag{5.1.4b}
\]

Paradox 5.2 Expand $(1 - z)^{-1}$ in a formal series in $z$ to get $\sum_{n \ge 0} z^n$, and $(1 - z)^{-1} =
-z^{-1}(1 - z^{-1})^{-1}$ in a formal series in $z^{-1}$ to get $-\sum_{n < 0} z^n$. Subtract these equal
expressions; we presumably should get 0, but we actually get $\delta(z)$. Similarly, applying
(5.1.4a) to $(1 + z)^{-1} = (z + 1)^{-1}$ again gives us a contradiction, namely $0 = \delta(-z)$.

The analytic explanation is that the first expansion in Paradox 5.2 converges only for
$|z| < 1$, while the second converges only for $|z| > 1$, so it would be naive to expect their formal
difference to be 0. We see from this that it really matters in which variable we expand
rational functions. The seemingly harmless (5.1.4a) is actually a convention saying that
we’ll expand in positive powers of the second variable. For instance, at first glance

\[
z_0^{-1}\,\delta\!\left(\frac{z_1 - z_2}{z_0}\right) - z_0^{-1}\,\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) = z_2^{-1}\,\delta\!\left(\frac{z_1 - z_0}{z_2}\right) \tag{5.1.5}
\]

is nonsense; it only holds if you expand the terms in positive powers of $z_2$, $z_1$ and $z_0$
respectively. A rational function by itself does not define a unique formal power series.
When we need to be explicit, we write $\iota_z(f)$ to expand a rational function $f$ in positive
powers of $z$ (i.e. for expanding it about $z = 0$). For example,

\[
\iota_z\!\left(\frac{1}{w - z}\right) - \iota_{z^{-1}}\!\left(\frac{1}{w - z}\right) = w^{-1}\,\delta(z/w).
\]
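The two expansions of $(1 - z)^{-1}$ in Paradox 5.2 can both be produced from the binomial convention (5.1.4a), simply by choosing which summand plays the role of the ‘second variable’. A minimal sketch follows; the function names and the truncation order `N` are assumptions made for the illustration.

```python
from fractions import Fraction

def binom(n, k):
    """Generalised binomial coefficient n(n-1)...(n-k+1)/k! for any integer n."""
    out = Fraction(1)
    for i in range(k):
        out *= Fraction(n - i, i + 1)
    return out

def expand_in_z(N):
    """(1 - z)**(-1) via (5.1.4a) with z1 = 1, z2 = -z: nonnegative powers of z.

    Returns {exponent: coefficient} for the first N terms.
    """
    return {k: binom(-1, k) * (-1) ** k for k in range(N)}

def expand_in_z_inverse(N):
    """The same function via (5.1.4a) with z1 = -z, z2 = 1: powers of 1/z.

    The k-th term is binom(-1, k) * (-z)**(-1 - k), and (-1)**(-1 - k)
    equals (-1)**(k + 1).
    """
    return {-1 - k: binom(-1, k) * (-1) ** (k + 1) for k in range(N)}

N = 20
small = expand_in_z(N)            # sum_{n >= 0} z**n
large = expand_in_z_inverse(N)    # -sum_{n < 0} z**n
difference = {n: small.get(n, 0) - large.get(n, 0) for n in range(-N, N)}

# Every coefficient of the difference is 1: the difference is delta(z), not 0.
print(all(value == 1 for value in difference.values()))
```

Both dictionaries describe the same rational function, yet they share no exponent at all, which is why their formal difference cannot vanish.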


Recall the operator product expansion (OPE) of quantum fields (4.3.2), introduced
to interpret pointwise products. Here we can study this more explicitly. For most pairs
$a(z), b(z) \in (\mathrm{End}\,V)[[z^{\pm 1}]]$, the naive product $a(z)\,b(z)$ will not be algebraically defined.
It is easy to prove directly from Proposition 5.1.2 (see theorem 2.3 of [330]) that if
$(z_1 - z_2)^N [a(z_1), b(z_2)] = 0$, then

\[
a(z_1)\,b(z_2) = \sum_{j=0}^{N-1} \frac{c^j(z_2)}{(z_1 - z_2)^{j+1}} + {:}a(z_1)\,b(z_2){:} \tag{5.1.6a}
\]

separates $a(z_1)\,b(z_2)$ into its singular and regular parts, where

\[
{:}a(z_1)\,b(z_2){:} = \Big(\sum_{n \ge 0} a_n z_1^n\Big)\, b(z_2) + b(z_2)\, \Big(\sum_{n < 0} a_n z_1^n\Big), \tag{5.1.6b}
\]
; N-I j! 

che) = 2 aG 5i“ je» bo- (5.1.6c) 
By $1/(z_1 - z_2)^{j+1}$ in (5.1.6a) we mean to expand $z_2$ in powers from $-j$ to $\infty$. The point
of (5.1.6a) is that the normal-ordered product (5.1.6b) is algebraically defined even at
$z_1 = z_2$ (Question 5.1.6), so any singular behaviour of $a(z_1)\,b(z_2)$ as $z_1 \to z_2$ is captured
by the finitely many series $c^j$. Equations (5.1.6) are the desired relation in CFT between
the singular part of the OPE of quantum fields and the commutators of modes, mentioned
in Section 4.3.2. The clarity that vertex algebras bring to quantum field theory (especially
CFT) alone makes their definition worth all the pain.


5.1.3 Axioms 


We are now prepared to introduce the important new structure called vertex operator
algebras (VOAs). Although VOAs are natural from the CFT perspective and appear to
be an important and rapidly developing area in mathematics, their definition is difficult
and nontrivial examples are not easy to find.
A VOA is an infinite-dimensional graded vector space $V = \oplus_{n \ge 0} V_n$ with infinitely
many bilinear products $u *_n v$ respecting the grading (in particular $V_k *_n V_\ell \subseteq V_{k+\ell-n-1}$),
obeying infinitely many constraints. We can collect all these products into one generating
function: to each $u \in V$ associate the formal power series (a vertex operator)

\[
Y(u, z) := \sum_{n \in \mathbb{Z}} u_{(n)} z^{-n-1} \in (\mathrm{End}\,V)[[z^{\pm 1}]].
\]

For each $u \in V$, the coefficients $u_{(n)}$ (called modes (5.1.3c)) are maps from $V$ to $V$. The
product $u *_n v$ is now written $u_{(n)}v := u_{(n)}(v)$. The bilinearity of $*_n$ translates into two
things: that $u \mapsto Y(u, z)$ is linear, and that each function $v \mapsto u_{(n)}v$ is itself linear (i.e.
$u_{(n)}$ is an endomorphism of $V$).


Definition 5.1.3 (a) Let $V$ be a graded vector space $V = \oplus_{n \in \mathbb{Z}} V_n$ such that each
subspace $V_n$ is finite-dimensional. Suppose we have a linear assignment $u \mapsto Y(u, z) =
\sum_{n \in \mathbb{Z}} u_{(n)} z^{-n-1}$ from $V$ into $(\mathrm{End}\,V)[[z^{\pm 1}]]$ and a distinguished vector $\mathbf{1} \in V_0$,
obeying the following properties $\forall u, v \in V$:

VA1. (grading) For $u \in V_k$, $u_{(n)}$ is a linear map from $V_\ell$ into $V_{k+\ell-n-1}$;

VA2. (vacuum) $Y(\mathbf{1}, z)$ is the identity (i.e. $\mathbf{1}_{(n)}v = \delta_{n,-1}v$);

VA3. (state-field correspondence) $Y(u, 0)\mathbf{1}$ exists and equals $u$;

VA4. (locality) $(z_1 - z_2)^M [Y(u, z_1), Y(v, z_2)] = 0$ for some integer $M = M(u, v)$;

VA5. (regularity) there is an $N = N(u, v)$ such that $u_{(n)}v = 0$ for all $n > N$.

Any such triple $(V, Y, \mathbf{1})$ is called a vertex algebra. The distributions $Y(u, z)$ are called
vertex operators, and the vector $\mathbf{1}$ is called the vacuum.

(b) A vertex algebra $(V, Y, \mathbf{1})$ is called a vertex operator algebra (VOA) if there is a
distinguished vector $\omega \in V_2$ such that

VOA1. (conformal symmetry) the modes $L_n := \omega_{(n+1)}$ make $V$ a $\mathfrak{Vir}$-module, whose central term $C$
in (3.1.5) acts as $c\,\mathrm{id}_V$ for some $c \in \mathbb{C}$;

VOA2. (conformal weight) $L_0 v = nv$ whenever $v \in V_n$;

VOA3. (translation generator) $Y(L_{-1}v, z) = \frac{d}{dz} Y(v, z)$;

VOA4. (CFT type) $V_0 = \mathbb{C}\mathbf{1}$ and $V_n = \{0\}$ for all $n < 0$.

The vector $\omega$ is called the conformal vector, and $c$ is called the central charge, conformal
anomaly or rank. The grading $n$ of $u \in V_n$ is called its conformal weight.

(c) A quadruple $(V, Y, \mathbf{1}, \omega)$ is called a near-VOA if all axioms of a VOA are satisfied,
except for VOA4, and in addition the homogeneous subspaces $V_n$ are allowed to be
infinite-dimensional.


We prefer the more descriptive name ‘conformal vertex algebra’ to the historical ‘vertex
operator algebra’, although it is probably too late to dislodge the latter name. We
study the Virasoro algebra in Section 3.1.2, where we discuss its relation to conformal
transformations. We are more interested in VOAs than vertex algebras, since the Virasoro
algebra is essential for the relation of $V$ to higher genus and in particular to modular
functions. The central charge $c$ is an important invariant of $V$. The original axioms
[68] by Borcherds didn’t involve $\mathfrak{Vir}$ nor require $\dim(V_n) < \infty$. The conformal axioms
VOA1-VOA3 were introduced in [201] along with the name ‘vertex operator algebra’.
Although VOA4 holds for most important VOAs and yields the richest theory, it is not
standard and is included here for simplicity. Note though that with it, VA5 becomes
redundant and can be dropped. The name ‘near-VOA’ is not standard; we need the notion in
Section 7.2.2.

In the physics literature, the vacuum $\mathbf{1}$ is often denoted $|0\rangle$, and in place of the expansion
$Y(u, z) = \sum_n u_{(n)} z^{-n-1}$ for $u \in V_k$ appears the expression $\sum_n u_n z^{-n-k}$ (so $L_n = \omega_n$).
This new expansion cleans up some formulae a little; it has the disadvantage though of
artificially favouring the ‘homogeneous’ vectors $u \in V_k$.

By Proposition 5.1.2, the peculiar-looking VA4 simply says that the commutator
$[Y(u, z_1), Y(v, z_2)]$ of two vertex operators is a finite linear combination of derivatives
of various orders of the Dirac delta centred at $z_1 = z_2$; this powerful locality axiom is at
the heart of a vertex algebra. A recommended exercise is to show that in a VOA, $M = 4$
works in VA4 for $u = v = \omega$; more generally see Question 5.1.4.

By $V = \oplus_n V_n$ here, we mean that any vector $u \in V$ can be expressed as a finite
sum $\sum_n u(n)$ of homogeneous vectors $u(n) \in V_n$. To emphasise this finiteness, the
notation

\[
V = \coprod_{n \in \mathbb{N}} V_n
\]

is often used. Note that in a vertex algebra, any series $Y(u, z)v$ will be a finite sum, that
is, the infinite sum $Y(u, z)$ is algebraically defined (Section 5.1.2).

An immediate consequence of VA1, VA2 and VOA2 is that $\mathbf{1} \in V_0$ and $\omega \in V_2$, so we
needn’t assume these.

Let $V$ be a vector space with a linear map $Y : V \to \mathrm{End}\,V$, such that $Y(u)\,Y(v) =
Y(v)\,Y(u)$. Also, assume that there exists a distinguished vector $\mathbf{1} \in V$ such that $Y(\mathbf{1})$ is the
identity, and such that $Y(u)\mathbf{1} = u$ for all $u \in V$. It isn’t hard to identify such a structure.
Given any $u, v \in V$, define the ‘product’ $u * v$ to be the value $Y(u)v$. The linearity
of $Y : V \to \mathrm{End}\,V$, as well as the linearity of each map $Y(u)$, yields the distributivity
laws. Also, $\mathbf{1} * u = Y(\mathbf{1})u = Iu = u$ and $u * \mathbf{1} = Y(u)\mathbf{1} = u$, so $\mathbf{1}$ is a unit. Evaluating
$Y(u)\,Y(v) = Y(v)\,Y(u)$ on $w$ gives

\[
u * (v * w) = Y(u)(Y(v)w) = Y(v)(Y(u)w) = v * (u * w).
\]

Substituting $w = \mathbf{1}$ gives $u * v = v * u$, that is, the product is commutative. Likewise,
$u * (v * w) = u * (w * v) = w * (u * v) = (u * v) * w$, so the product is associative. Thus
a vertex algebra is an analogue of a commutative associative algebra with unit, where
there is a product $u *_z v = Y(u, z)v$ at each point $z$ in a punctured disc. A vertex algebra
isn’t as obscure as it may first look.
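One concrete instance of the structure just described is a finite-dimensional commutative algebra with $Y(u)$ taken to be ‘multiplication by $u$’. The sketch below uses the truncated polynomial algebra $\mathbb{C}[x]/(x^3)$; that choice, the coordinate representation and every function name are assumptions made for illustration, not constructions from the text.

```python
DIM = 3  # V = C[x]/(x**3) with basis 1, x, x**2 (an illustrative choice)

def Y(u):
    """Matrix of 'multiplication by u' on V; column j holds u * x**j."""
    M = [[0] * DIM for _ in range(DIM)]
    for j in range(DIM):
        for i, c in enumerate(u):
            if i + j < DIM:          # x**(i+j) vanishes once i + j >= 3
                M[i + j][j] += c
    return M

def apply(M, v):
    """Matrix times column vector."""
    return [sum(M[r][c] * v[c] for c in range(DIM)) for r in range(DIM)]

def matmul(A, B):
    """Matrix product."""
    return [[sum(A[r][k] * B[k][c] for k in range(DIM)) for c in range(DIM)]
            for r in range(DIM)]

def star(u, v):
    """The product u * v := Y(u) v."""
    return apply(Y(u), v)

one = [1, 0, 0]
u, v, w = [2, -1, 3], [0, 5, 1], [4, 4, -2]

print(matmul(Y(u), Y(v)) == matmul(Y(v), Y(u)))     # Y(u) Y(v) = Y(v) Y(u)
print(apply(Y(u), one) == u)                        # Y(u) 1 = u
print(star(u, v) == star(v, u))                     # commutativity
print(star(u, star(v, w)) == star(star(u, v), w))   # associativity
```

The argument in the text runs in the opposite direction: it starts from the three hypotheses on $Y$ and deduces commutativity and associativity, whereas here they are simply verified for one example.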


Theorem 5.1.4 The following are equivalent:
(i) $V$ is a commutative vertex algebra, i.e. $Y(u, z_1)\,Y(v, z_2) = Y(v, z_2)\,Y(u, z_1)$ for
all $u, v \in V$;
(ii) $V = \oplus_n V_n$ is a $\mathbb{Z}$-graded commutative associative algebra with unit and
derivation, with each $\dim(V_n) < \infty$;

(iii) $V$ is a vertex algebra where each vertex operator $Y(u, z)$ involves only
nonnegative powers of $z$, i.e. $u_{(n)} = 0$ for all $n \ge 0$.

Proof: The equivalence (i) $\Leftrightarrow$ (ii) was essentially established in the previous paragraph.

(i) $\Rightarrow$ (iii): Consider the equality

\[
\sum_{n \in \mathbb{Z}} u_{(n)}v\, z_1^{-n-1} = Y(u, z_1)v = Y(u, z_1)\,Y(v, z_2)\mathbf{1}\big|_{z_2 = 0}
= Y(v, z_2)\,Y(u, z_1)\mathbf{1}\big|_{z_2 = 0} = \sum_{n \ge 0} v_{(-1)} u_{(-n-1)} \mathbf{1}\, z_1^{n}.
\]

Since the expression on the right side involves nonnegative powers of $z_1$ only, the same
must hold for the left side.

(iii) $\Rightarrow$ (i): For any power series $f(z_1, z_2) = \sum_{m,n=0}^{\infty} a_{mn} z_1^m z_2^n \in W[[z_1, z_2]]$,
Proposition 5.1.2 implies that $(z_1 - z_2)^N f(z_1, z_2) = 0 \Rightarrow f(z_1, z_2) = 0$, since each residue of
$f(z_1, z_2)$ will be 0. Applying this to $f(z_1, z_2) = [Y(u, z_1), Y(v, z_2)]$ gives the desired
result. ∎


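Locality VA4 can be rewritten in the form (see Section 3.2 of [376])

\[
z_0^{-1}\,\delta\!\left(\frac{z_1 - z_2}{z_0}\right) Y(u, z_1)\,Y(v, z_2) - z_0^{-1}\,\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) Y(v, z_2)\,Y(u, z_1)
= z_2^{-1}\,\delta\!\left(\frac{z_1 - z_0}{z_2}\right) Y(Y(u, z_0)v, z_2), \tag{5.1.7a}
\]

where the formal series are expanded appropriately. This embodiment of commutativity
and associativity in the vertex algebra is called the Jacobi identity since it plays an
analogous role in VOAs as the Jacobi identity plays in Lie algebras. It corresponds
directly to the duality of the sphere with four points removed (namely Figure 6.3(a)).

Expanding it out, the coefficient in front of $z_0^{-\ell-1} z_1^{-m-1} z_2^{-n-1}$ gives Borcherds’ identity:

\[
\sum_{i \ge 0} (-1)^i \binom{\ell}{i} \Big( u_{(\ell+m-i)} \circ v_{(n+i)} - (-1)^{\ell}\, v_{(\ell+n-i)} \circ u_{(m+i)} \Big)
= \sum_{i \ge 0} \binom{m}{i} \big(u_{(\ell+i)}v\big)_{(m+n-i)}. \tag{5.1.7b}
\]

Specialising (5.1.7b) to $\ell = 0$ and $m = 0$, respectively, gives us

\[
[u_{(m)}, v_{(n)}] = \sum_{i \ge 0} \binom{m}{i} \big(u_{(i)}v\big)_{(m+n-i)}, \tag{5.1.7c}
\]

\[
\big(u_{(\ell)}v\big)_{(n)} = \sum_{i \ge 0} (-1)^i \binom{\ell}{i} \Big( u_{(\ell-i)} \circ v_{(n+i)} - (-1)^{\ell}\, v_{(\ell+n-i)} \circ u_{(i)} \Big). \tag{5.1.7d}
\]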
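As a consistency check on (5.1.7c), one can apply it to $u = v = \omega$ in a VOA. The computation below is a sketch that assumes the standard values $\omega_{(0)}\omega = L_{-1}\omega$, $\omega_{(1)}\omega = 2\omega$, $\omega_{(2)}\omega = 0$, $\omega_{(3)}\omega = \tfrac{c}{2}\mathbf{1}$ and $\omega_{(i)}\omega = 0$ for $i \ge 4$; these are not derived in the text above. It also uses the consequence $(L_{-1}\omega)_{(k)} = -k\,\omega_{(k-1)}$ of VOA3 and the vacuum axiom VA2.

```latex
% Assumed inputs (standard, not derived in the surrounding text):
%   \omega_{(0)}\omega = L_{-1}\omega,  \omega_{(1)}\omega = 2\omega,
%   \omega_{(2)}\omega = 0,  \omega_{(3)}\omega = (c/2)\mathbf{1},
%   \omega_{(i)}\omega = 0 for i >= 4.
\begin{align*}
[L_m, L_n] &= [\omega_{(m+1)}, \omega_{(n+1)}]
   = \sum_{i \ge 0} \binom{m+1}{i} \big(\omega_{(i)}\omega\big)_{(m+n+2-i)} \\
 &= (L_{-1}\omega)_{(m+n+2)} + 2(m+1)\,\omega_{(m+n+1)}
   + \binom{m+1}{3}\,\frac{c}{2}\,\mathbf{1}_{(m+n-1)} \\
 &= -(m+n+2)\,L_{m+n} + 2(m+1)\,L_{m+n}
   + \frac{c}{12}\,(m^3 - m)\,\delta_{m+n,0} \\
 &= (m-n)\,L_{m+n} + \frac{c}{12}\,(m^3 - m)\,\delta_{m+n,0}.
\end{align*}
```

Only the terms with $i \le 3$ contribute, which is consistent with $M = 4$ working in the locality axiom for $u = v = \omega$.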
In any vertex algebra, define an endomorphism $T : V \to V$ by

\[
Tu = u_{(-2)}\mathbf{1}. \tag{5.1.8a}
\]




This is the derivation of Theorem 5.1.4(ii). Indeed, applying (5.1.7d) to it and using VA2,
we get $Y(Tu, z) = \frac{d}{dz} Y(u, z)$. Thus in any VOA, VOA3 says

\[
L_{-1}u = u_{(-2)}\mathbf{1}. \tag{5.1.8b}
\]

Moreover, (5.1.7c) tells us that any $u \in V$ automatically obeys $[u_{(0)}, Y(v, z)] =
Y(u_{(0)}v, z)$. Thus in any VOA

\[
[L_{-1}, Y(u, z)] = \frac{d}{dz} Y(u, z).
\]

More generally, a more subtle argument (see e.g. proposition 3.1.19 of [376]) shows that
in any vertex algebra, we have

\[
Y(u, z)v = e^{zT}\, Y(v, -z)u.
\]

These equations also allow us to compute explicitly the grading of $u_{(n)}v$ in a VOA,
recovering VA1: let $u \in V_k$, $v \in V_\ell$, then

\[
L_0(u_{(n)}v) = \omega_{(1)}(u_{(n)}v) = u_{(n)}(\omega_{(1)}v) + (\omega_{(0)}u)_{(n+1)}v + (\omega_{(1)}u)_{(n)}v = (k + \ell - n - 1)\,u_{(n)}v.
\]

Duality (5.1.7a) also implies (see section 3.8 of [376])

\[
Y(u_{(-m)}v, z) = \frac{1}{(m-1)!}\, {:}\Big(\frac{d^{m-1}}{dz^{m-1}} Y(u, z)\Big)\, Y(v, z){:}\,, \tag{5.1.9a}
\]

\[
Y(u_{(n)}v, z) = \mathrm{Res}_{z_1}\big((z_1 - z)^n\, [Y(u, z_1), Y(v, z)]\big), \tag{5.1.9b}
\]

where $m \ge 1$ and $n \ge 0$. As we see next section, this is quite useful as a way of obtaining
the full VOA from a small number of generators.

Unexpectedly, modular functions arise in VOA theory through the generating functions
of the dimensions of the homogeneous spaces:

\[
\mathrm{tr}_V\, q^{L_0} = \sum_{n=0}^{\infty} \dim V_n\, q^n. \tag{5.1.10a}
\]

We also see this important theme in, for example, Section 3.1.2. As in (3.1.10), a small
refinement should be made: by the graded dimension $\chi_V(\tau)$ of $V$ we mean

\[
\chi_V(\tau) = \mathrm{tr}_V\, e^{2\pi i \tau (L_0 - c/24)} = q^{-c/24} \sum_{n=0}^{\infty} \dim V_n\, q^n, \tag{5.1.10b}
\]

where as always $q = e^{2\pi i \tau}$. The reason for the $q \mapsto \tau$ change-of-variables here will
turn out to be the same as why Gauss and Jacobi introduced it into Euler’s generating
function $1 + 2x + 2x^4 + 2x^9 + \cdots$: both the graded dimension of $V$ and Euler’s generating
function are naturally associated with tori. Explanations for the now-familiar
$-c/24$ shift are given in Sections 3.2.3 and 5.3.4. Incidentally, the term character is
also used in the literature for $\chi_V(\tau)$, but Section 5.3.3 contains our diatribe against this
misnomer.

Section 1.5 illustrates the usefulness of the Killing form in Lie theory. Similarly,
our VOAs all have a nondegenerate invariant bilinear form [199], that is, a bilinear pairing
$(u|v) \in \mathbb{C}$ for $u, v \in V$, such that

\[
(Y(u, z)v \,|\, w) = \big(v \,\big|\, Y(e^{zL_1}(-z^{-2})^{L_0}u, z^{-1})\,w\big), \quad \forall u, v, w \in V. \tag{5.1.11a}
\]

That this complicated definition is right is explained in remark 5.3.3 of [199] and
equation (54) of [242]. For such a form, the homogeneous spaces $V_n$ are mutually orthogonal,
symmetry $(u|v) = (v|u)$ holds, and we recover familiar RCFT formulae such as
$(L_n u|v) = (u|L_{-n}v)$. It is known (section 3 of [380]) that there is a unique invariant
bilinear form (up to a scalar factor), provided that $V$ is simple (defined in Section 5.3.1)
and

\[
L_1 V_1 = 0 \tag{5.1.11b}
\]

(both conditions are always satisfied by our VOAs). In this case the bilinear form restricted
to each space $V_n$ will be nondegenerate. The most convenient normalisation is

\[
(\mathbf{1}|\mathbf{1}) = -1, \tag{5.1.12a}
\]

because for this choice the bilinear form on the homogeneous space $V_1$ becomes

\[
(u|v)\,\mathbf{1} = u_{(1)}v, \quad \forall u, v \in V_1. \tag{5.1.12b}
\]

The invariant bilinear form plays an important role in CFT as well as Moonshine.

By a vertex operator superalgebra we mean there is a $\mathbb{Z}_2$-grading of $V = V_{\bar 0} \oplus V_{\bar 1}$ into
even and odd parity subspaces, and for $u, v$ both odd the commutator in, for example,
Axiom VA4 is replaced by an anti-commutator. Their basic theory is very similar to that
of VOAs (see e.g. [330]). For instance, we write

\[
\chi_V(\tau) := \chi_{V_{\bar 0}}(\tau) - \chi_{V_{\bar 1}}(\tau).
\]

Although we occasionally allude to vertex operator superalgebras (e.g. in Sections 5.4.2
and 7.3.5), we won’t develop their theory.

In RCFT, $V$ would be the ‘Hilbert space of states’ (more carefully, $V$ is a dense subspace
of it), and $z = e^{t + ix}$ would be a local complex coordinate on a Riemann surface. $L_0$
generates time translations, and so its eigenvalues (the conformal weights) are energy. For
each state $u$, the vertex operator $Y(u, z)$ is a meromorphic (chiral) quantum field. $Y(\omega, z)$
is the stress-energy tensor $T$. Physically, the requirement that $V_n = 0$ for $n < 0$ says that
the vacuum $\mathbf{1} = |0\rangle$ is the state with minimal energy. Also, $z = 0$ in VA3 corresponds
to the time limit $t \to -\infty$. The most important axiom, VA4, says that quantum fields
commute away from $z_1 = z_2$, and so are local. It is equivalent to the duality of chiral
blocks in CFT, discussed in Sections 4.3.2, 4.4.1, 6.1.4.

In Segal’s language (Section 4.4.1), $Y(u, z)$ appears quite naturally. Consider the
virtual event of two strings combining to form a third. To first order (i.e. the tree-level
Feynman diagram), this would correspond in Segal’s language to a ‘pair-of-pants’, or
a sphere with three punctures, two of which are negatively oriented (corresponding to
incoming strings) and the other positively oriented. We can think of this as the Riemann
sphere $\mathbb{C} \cup \{\infty\}$; put the punctures at $\infty$ (outgoing) and $z$ and 0 (incoming). Segal’s
functor $\mathcal{T}$ associates with this a $z$-dependent homomorphism $\varphi_z : V \times V \to V$, where
$\varphi_z(u, v) = Y(u, z)v \in V$. Incidentally, the symbol ‘Y’ could have been chosen because
of this ‘pair-of-pants’ picture (time flows from the top of the ‘Y’ to the bottom).$^1$

By VOA1, any VOA is a $\mathfrak{Vir}$-module. For most VOAs, this module is highly reducible.
By a conformal primary $v$ of conformal weight $k$ we mean $L_n v = 0$ for all $n > 0$
and $L_0 v = kv$ for some $k$. These states are especially well behaved. Any such primary
generates a highest-weight module for $\mathfrak{Vir}$, on the space spanned by the elements
$L_{-n_1} \cdots L_{-n_m} v$. The VOAs we are interested in are generated by the conformal primaries
together with the operators $L_n$, in the sense that $V$ can be decomposed into a direct sum
(usually infinite) of highest-weight $\mathfrak{Vir}$-modules.


Question 5.1.1. Theorem 5.1.1 actually provides a realisation for a highest-weight
representation of $A_1^{(1)}$. Identify that representation.

Question 5.1.2. Using the notion of algebraic summability, write down an algebraic
definition of $\lim_{z_1 \to z_2} F(z_1, z_2)$ valid for formal power series $F(z_1, z_2) \in W[[z_1^{\pm 1}, z_2^{\pm 1}]]$,
realising the intuition of substituting in $z_1 = z_2$. Prove that $\lim_{z_1 \to z_2} F(z_1, z_2)$ ‘algebraically
exists’ iff the product $F(z_1, z_2)\,\delta(z_1/z_2)$ does, in which case $F(z_1, z_2)\,\delta(z_1/z_2) =
F(z_2, z_2)\,\delta(z_1/z_2)$.

Question 5.1.3. (a) Given any formal power series $F(z) \in W[[z^{\pm 1}]]$, prove that

\[
e^{w \frac{d}{dz}} F(z) = F(z + w).
\]

(b) Prove (5.1.5).

Question 5.1.4. (a) Let $V$ be any VOA, and $u, v \in V$. Then for any $k \in \mathbb{Z}$, prove that

\[
(z_1 - z_2)^k\, [Y(u, z_1), Y(v, z_2)] = \sum_{\ell \ge 0} \frac{1}{\ell!} \Big(\partial_{z_2}^{\ell}\, z_1^{-1}\delta(z_2/z_1)\Big)\, Y(u_{(k+\ell)}v, z_2).
\]

(b) Let $u \in V_m$, $v \in V_n$ be homogeneous vectors in any vertex algebra $V$. Prove that
$M(u, v) = m + n$ works in VA4.

Question 5.1.5. (a) Prove that in any vertex algebra, the vacuum $\mathbf{1}$ is translation-invariant,
i.e. $T\mathbf{1} = 0$.

(b) In any VOA, verify that the span of $L_{-1}$, $L_0$, $L_1$ is the Lie algebra $\mathfrak{sl}_2(\mathbb{C})$. Verify that
the vacuum is invariant under it.

Question 5.1.6. Prove that for any $a, b, u$ in a vertex algebra $V$, every coefficient of $z^n$ in
${:}a(z)\,b(z){:}\,u$ involves a finite sum, and for all but finitely many negative $n$ this sum is 0.


5.2 Basic theory 


A VOA is a remarkably rich algebraic structure, with infinitely many heavily constrained
products. In this section we continue to work out the easy consequences of the axioms.
The deep role of the Virasoro algebra remains hidden in this section. We also associate
VOAs with lattices and affine algebras.

$^1$ But it wasn’t. Remarkably, the actual historical reason is that Y comes after X, and X was the name
arbitrarily chosen in [201] for a pre-vertex operator. The symbol Y first appeared in their chapter 8;
Borcherds used the symbol Q.


5.2.1 Basic definitions and properties 


For any $u \in V_n$, define $o(u) = u_{(n-1)}$. Then VA1 tells us $o(u)$ preserves each grade, that
is, it maps each homogeneous space $V_m$ to itself. In particular, every space $V_n$ has an
algebraic structure defined by $u \times v = o(u)\,v$. In the CFT literature, these are called the
zero-mode algebras (because $u_{(n-1)} = u_0$).

Typically, the zero-mode algebras $V_n$ are quite complicated. However, consider
$V_1$. Put $\ell = m = n = 0$ in (5.1.7b) and hit it with any $w \in V$: we get $u_{(0)}(v_{(0)}w) -
v_{(0)}(u_{(0)}w) = (u_{(0)}v)_{(0)}w$. If we now formally write $[xy] := x_{(0)}y$, then this becomes
$[u[vw]] - [v[uw]] = [[uv]w]$, which is one of the forms of the Lie algebra Jacobi identity
(1.4.1b). Thus our bracket will be anti-associative if it is anti-commutative, in which
case $V_1$ will be a Lie algebra. But is it anti-commutative? From (5.1.9) we get

\[
u_{(n)}v = \sum_{i \ge 0} \frac{1}{i!}\, (-1)^{n+i+1}\, (L_{-1})^i\, \big(v_{(n+i)}u\big), \tag{5.2.1}
\]

so $u_{(0)}v = -v_{(0)}u$ (mod $L_{-1}V$). However, from VA1, VOA4 and Question 5.1.5, we get

\[
(L_{-1}V)_1 = L_{-1}(V_0) = L_{-1}(\mathbb{C}\mathbf{1}) = \{0\}.
\]

Thus, in any VOA, $V_1$ is a finite-dimensional Lie algebra. Each homogeneous space $V_n$
is a module for $V_1$.

Given any $u, v \in V_1$, $u_{(1)}v \in V_0 = \mathbb{C}\mathbf{1}$, and so define $(u|v) \in \mathbb{C}$ by $(u|v)\,\mathbf{1} = u_{(1)}v$.
From (5.2.1), $(u|v) = (v|u)$, so $(*|*)$ defines a symmetric bilinear form on $V_1$. We would
like $(*|*)$ to respect the Lie algebra structure, that is, be $[**]$-invariant. We compute from
(5.1.7) and VA2

\[
([uv]|t)\,\mathbf{1} = -v_{(0)}(u_{(1)}t) + u_{(1)}(v_{(0)}t) = (u|[vt])\,\mathbf{1}, \tag{5.2.2}
\]

that is, $([uv]|t) = (u|[vt])$ and $(*|*)$ is indeed $[**]$-invariant. Of course, this bilinear
form is identical with that of (5.1.12b), and so provided (5.1.11b) is satisfied, it will be
nondegenerate.

The existence of this bilinear form severely restricts the possibilities for the Lie algebra
$V_1$. Such Lie algebras are called self-dual and are precisely those for which the Sugawara
construction (3.2.15) works. They are studied, for instance, in [415], [189], [384];
see also example 2.1 in [156]. If we also demand that the VOA be weakly rational
(Definition 5.3.2), then $V_1$ will be reductive (i.e. a direct sum of simple and trivial Lie
algebras) [156].

The affinisation $V_1^{(1)}$ of the Lie algebra $V_1$ also appears naturally in the VOA $V$. In
particular, the modes $u_{(n)}$, for all $u \in V_1$ and $n \in \mathbb{Z}$, have the commutators

\[
u_{(m)} \circ v_{(n)} - v_{(n)} \circ u_{(m)} = [u, v]_{(m+n)} + m\,(u|v)\,\delta_{m+n,0}\,\mathbf{1}_{(-1)}.
\]

Thus these $u_{(n)}$, together with centre $\mathbb{C}\mathbf{1}_{(-1)}$ and derivation $L_{-1}$, span a $V_1^{(1)}$-module.
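The commutator of the modes $u_{(m)}$, $v_{(n)}$ for $u, v \in V_1$ follows in a few lines from the commutator formula (5.1.7c). The derivation below is a sketch of that step; it uses only the grading axiom VA1, the vacuum axiom VA2, the condition VOA4, and the definitions $[uv] = u_{(0)}v$ and $(u|v)\,\mathbf{1} = u_{(1)}v$ given above.

```latex
% For u, v in V_1: u_{(i)}v lies in V_{1-i} by VA1, so
%   u_{(0)}v = [uv] \in V_1,  u_{(1)}v = (u|v)\mathbf{1} \in V_0,
%   and u_{(i)}v = 0 for i >= 2 by VOA4.
\begin{align*}
[u_{(m)}, v_{(n)}] &= \sum_{i \ge 0} \binom{m}{i} \big(u_{(i)}v\big)_{(m+n-i)} \\
 &= \big(u_{(0)}v\big)_{(m+n)} + m\,\big(u_{(1)}v\big)_{(m+n-1)} \\
 &= [uv]_{(m+n)} + m\,(u|v)\,\mathbf{1}_{(m+n-1)} \\
 &= [uv]_{(m+n)} + m\,(u|v)\,\delta_{m+n,0}\,\mathbf{1}_{(-1)}.
\end{align*}
```

The last step is VA2: $\mathbf{1}_{(k)}$ vanishes unless $k = -1$, in which case it is the identity, and this is what produces the central term of the affine bracket.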
More generally, in Section 7.2.2 we need to obtain a Lie algebra from a near-VOA $V$.
As before, we obtain a Lie algebra structure on $V/L_{-1}V$, and it has an invariant bilinear
form if we restrict to $V_1/L_{-1}V_0$. In the situations we will be interested in, this algebra is
too large, but it can be reduced as follows. Define

\[
PV_n := \{u \in V_n \mid L_m u = 0 \text{ for all } m > 0\}, \tag{5.2.3}
\]

i.e. the conformal primaries with conformal weight $n$. Then a straightforward calculation
verifies that $PV_1/(L_{-1}V_0 \cap PV_1)$ is itself a Lie algebra, with the usual bracket. Through
the map $u \mapsto u_{(0)}$, this Lie algebra acts on $V$ and this action commutes with that of $L_m$.
These associations of Lie algebras to (near-)VOAs are due to Borcherds [68].

By an automorphism (or symmetry) $\alpha$ of a VOA $V$ we mean an invertible linear map
$\alpha : V \to V$ obeying

\[
\alpha(Y(u, z)\,v) = Y(\alpha(u), z)\,\alpha(v),
\]

together with $\alpha(\mathbf{1}) = \mathbf{1}$ and $\alpha(\omega) = \omega$. This is how group theory arises in VOAs. The
automorphism group can be finite (e.g. $\mathrm{Aut}(V^\natural) = \mathbb{M}$) or infinite (e.g. $\mathrm{Aut}(V(\Lambda)) =
(\mathbb{R}^{*})^{24}.\mathrm{Co}_0$), but it can be finite only if $V_1 = 0$ (Question 5.2.2). Conjecturally, at least
when $V$ is sufficiently nice, $\mathrm{Aut}(V)$ will be finite if (and only if) $V_1 = 0$.

Similar arguments (Question 5.2.3) show that when $V_1 = 0$, $V_2$ is a commutative
non-associative algebra with product $u \times v := u_{(1)}v \in V_2$ and identity element $\frac{1}{2}\omega$. Moreover,
an ‘associative’ bilinear form can be defined on $V_2$ (Question 5.2.3). For example, the
Moonshine module $V^\natural$ satisfies $V^\natural_1 = 0$ (Section 7.2.1), and $V^\natural_2$ is none other than the
Griess algebra [263] extended by an identity element.

The operators $u_{(0)}$, $u \in V$, are derivations (i.e. infinitesimal automorphisms) of $V$,
that is

\[
[u_{(0)}, Y(v, z)] = Y(u_{(0)}(v), z), \tag{5.2.4}
\]

and so $\exp(u_{(0)})$ is an automorphism of $V$ if it is defined. This is important to the BRST
cohomology construction (Question 5.2.4), borrowed from string theory.


5.2.2 Examples 


Unlike more classical algebraic structures, VOAs are notorious for having no easy exam- 
ples. In this section we construct families of them, in the most direct way possible. This 
explicitness has the drawback of making the constructions seem ad hoc. The reader inter- 
ested in seeing the naturality of these constructions should consult the more sophisticated 
treatments in, for example, [330], [376]. 

Recall from (3.2.12a) the oscillator algebra ĝ = u_1^(1), with basis consisting of a_n,
n ∈ ℤ, together with the central term C. For any nonzero level k ∈ ℂ, we get a ‘vacuum
module’ V(ĝ, k) defined to have basis consisting of the formal combinations


a_{−m_1} ··· a_{−m_r} 1 (5.2.5a)


for r ≥ 0, where m_1 ≥ m_2 ≥ ··· ≥ m_r ≥ 1. Using the actions C1 = k1, a_n 1 = 0 for
n ≥ 0, we see V(ĝ, k) has a u_1^(1)-module structure. Of course u_1 embeds into V(ĝ, k) by
sending x ∈ u_1 to x a_{−1} 1.

We claim that V(u_1^(1), k) has a VOA structure, for k ≠ 0. For the assignment of vertex
operators, it suffices by (5.1.9) to define Y(x, z) for x ∈ u_1: we get the ‘current’


Y(x a_{−1} 1, z) = x Σ_{n∈ℤ} a_n z^{−n−1}. (5.2.5b)


All other vertex operators follow from (5.1.9). For example, for m ≥ 1,


Y(a_{−m} 1, z) = (1/(m−1)!) (d^{m−1}/dz^{m−1}) Σ_{n∈ℤ} a_n z^{−n−1}.


The unique singular term in the OPE (5.1.6) of the basic current with itself is


Y(a_{−1} 1, z_1) Y(a_{−1} 1, z_2) = C/(z_1 − z_2)^2 + ···. (5.2.5c)


The Sugawara construction (3.2.14a) says here that the conformal vector is


ω = (1/(2k)) a_{−1} a_{−1} 1, (5.2.5d)


which makes V(ĝ, k) into a (highly reducible) Vir-module with central charge c = 1.
We also get the commutation relations


[L_m, a_n] = −n a_{m+n}. (5.2.5e)


In particular, the grading, given as we know by L_0, assigns the basis vector (5.2.5a)
conformal weight m_1 + ··· + m_r, so the current (5.2.5b) has conformal weight 1.
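
As a small check of this grading: since the basis vector (5.2.5a) has conformal weight m_1 + ··· + m_r, the number of basis vectors of weight n equals the number of partitions of n. The following Python sketch counts them; the function name is purely illustrative and does not come from the text.

```python
def heisenberg_graded_dims(max_weight):
    """dims[n] = number of basis vectors a_{-m_1} ... a_{-m_r} 1 of conformal
    weight n, i.e. the number of partitions of n (illustrative helper)."""
    dims = [1] + [0] * max_weight          # weight 0: only the vacuum 1
    for part in range(1, max_weight + 1):  # allow oscillators a_{-part}
        for n in range(part, max_weight + 1):
            dims[n] += dims[n - part]
    return dims

print(heisenberg_graded_dims(6))  # [1, 1, 2, 3, 5, 7, 11]
```

The single vector of weight 1 is the current a_{−1} 1, in agreement with the statement that the current has conformal weight 1.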
There is an obvious generalisation to any abelian Lie algebra 𝔥 = ℂ^n with a choice
of nondegenerate inner product on the space 𝔥 (this defines the central term of the affine
bracket (3.2.12a)). Namely, replace a with an orthonormal basis a^1, ..., a^n of ℂ^n; the
basis of the VOA is built up from all the operators a^i_{−m} as in (5.2.5a). These VOAs
V(𝔥^(1), k) are often called Heisenberg VOAs, because 𝔥^(1) is a Heisenberg algebra (i.e.
a Lie algebra whose commutator subalgebra equals its centre). It turns out (Question 5.2.6) that
the VOA V(𝔥^(1), k) is independent of the choice of level k, provided k ≠ 0, and also
the choice of inner product, provided it is nondegenerate. We will let V(ℂ^n) denote the
Heisenberg VOA with level k = 1 and standard inner product on the abelian Lie algebra
ℂ^n.

The generalisation to any affine algebra ĝ = g^(1) [68], [201], [202], [384] is also
straightforward. To any level k ∈ ℂ, k ≠ −h^∨ (h^∨ the dual Coxeter number of g), we get
a natural VOA structure V(ĝ, k) on the Verma module M(kω_0) associated with highest
weight kω_0, with central charge (3.2.9c). For example, from the Sugawara construction
(3.2.15), the conformal vector is


ω = (1/(2(k + h^∨))) Σ_i a^i_{−1} b^i_{−1} 1, (5.2.6)


where a^i, b^i ∈ g are bases for g, dual with respect to the Killing form on g: (a^i | b^j) = δ_{ij}.
Any pair of dual bases gives the same ω: the element Σ_i a^i b^i in the universal enveloping
algebra U(g) is simply the Casimir operator, and lies in the centre of U(g). The only
important difference here from the Heisenberg VOA is that sometimes there are ‘null
vectors’, that is the Verma module M(λ) may not be irreducible. In fact, a maximal number
of null vectors is the signature of the most interesting levels, namely k ∈ ℕ. We should
quotient out all null vectors: by V(ĝ, k) we mean the VOA structure (5.2.6) on the
irreducible ĝ-module L(kω_0) defined in Section 3.2.3. Most interesting (because of its
representation theory, see Section 5.3) is V(ĝ, k) when k ∈ ℕ, what we will call integrable
affine VOAs.

The Lie algebra V_1 associated with these affine algebra VOAs V = V(ĝ, k) is isomorphic
to the reductive Lie algebra g. Its affinisation, defined last subsection, equals ĝ.

The forbidden level k = −h^∨ is called the critical level and is very interesting in its
own way. The conformal structure is lost (the conformal vector (5.2.6) won’t exist), but 
the Möbius symmetry remains. The affine algebra vertex algebras at critical level have a 
highly nontrivial centre, and through it are related to geometric Langlands (see e.g. the 
discussion in section 17.4 of [197]). For this reason, it should be interesting to study it 
from the context of CFT. 

Another relatively simple class of VOAs are associated with lattices [68], [201]. 
The simplest possibility is an n-dimensional positive-definite lattice L (Section 1.2.1), 
all of whose inner products α · β are even integers. By ℂ{L} we mean the (infinite-dimensional)
group algebra of the additive group L, written using formal exponentials:
for each vector v ∈ L, we have a basis vector e^v of ℂ{L}, which multiply by e^u e^v = e^{u+v}.
Let 𝔥 = ℂ ⊗ L ≅ ℂ^n be the underlying complex vector space of L, interpreted as an
abelian Lie algebra. It inherits the inner product of L. The underlying vector space of the
VOA V(L) is V(𝔥) ⊗ ℂ{L}, where V(𝔥) is the Heisenberg VOA constructed earlier. The
vertex operator Y(h ⊗ 1, z), for h ∈ V(𝔥), equals the vertex operator Y(h, z) in V(𝔥).
Less clear is how to define the vertex operators Y(1 ⊗ e^α, z), but once we know how
the affine algebra 𝔥^(1) acts on the group algebra ℂ{L}, they will be heavily constrained
by the OPEs (5.1.6a). Define (h t^m) . e^α = (h|α) δ_{m,0} e^α, for any h ∈ 𝔥 and α ∈ L, where
we identify α ∈ L with the corresponding vector in 𝔥 = ℂ ⊗ L. Then the OPE (5.1.6a)
tells us (as usual displaying only the singular terms)


h(z_1) Y(1 ⊗ e^α, z_2) = ((h|α)/(z_1 − z_2)) Y(1 ⊗ e^α, z_2) + ···.


From this, and the pairwise locality of these vertex operators, we derive the formula


Y(1 ⊗ e^α, z) = e^α exp(−Σ_{j<0} (z^{−j}/j) α_j) exp(−Σ_{j>0} (z^{−j}/j) α_j) z^{α_0}.

In the usual way, this determines all vertex operators Y(h ⊗ e^α, z). The vacuum is 1 ⊗ 1
and the conformal vector ω is ω_𝔥 ⊗ 1, where ω_𝔥 is the conformal vector of V(𝔥); the
central charge c though now equals the dimension n of L. The vectors h ⊗ 1 for h ∈ 𝔥
have conformal weight 1, while 1 ⊗ e^α have conformal weight (α|α)/2.

The construction is the same for any even positive-definite lattice L (i.e. every vector
has even norm-squared), except that the group algebra ℂ{L} should be ‘twisted’ so that
e^α e^β = (−1)^{(α|β)} e^β e^α. If instead L is an odd positive-definite lattice (i.e. an integral lattice
with some vectors of odd norm-squared), the same construction yields vertex operator
superalgebras (i.e. VOAs except the locality axiom vA4 can involve anti-commutators).
For example, L = ℤ describes two fermions.

Repeating this construction for an indefinite even lattice L will yield a near-VOA. To
see this, note that the conformal weight of 1 ⊗ e^α is (α|α)/2. If we regard V(L) as graded
by L rather than by ℤ, we obtain a grading into finite-dimensional subspaces.

There are several ways to construct new VOAs from old ones. For example, one 
can take the direct sum of VOAs with equal central charge (this doesn’t change the 
central charge), or tensor products of arbitrary VOAs (the central charge adds) — see 
section 3.12 of [376]. The orbifold construction mods out by discrete symmetries: for a 
finite group G of symmetries of a VOA V, let V^G denote the subspace of V fixed by G;
then V^G is a vertex operator subalgebra of V (see Sections 4.3.4 and 5.3.6).

Finally, the Goddard–Kent–Olive (GKO) coset construction [250] mods out by continuous
symmetries. In particular, let (V, Y, 1, ω) and (V′, Y, 1, ω′) be VOAs with V′ ⊂ V. So
V′ would be a vertex operator subalgebra of V except the conformal vectors need not
be equal. Assume, however, that ω′ ∈ V_2 and L_1 ω′ = 0. The coset construction finds a
VOA structure on the centraliser


C_V(V′) := {v ∈ V | [Y(v, z_1), Y(u, z_2)] = 0 ∀u ∈ V′}
= {v ∈ V | v_(n) u = 0 ∀u ∈ V′, n ≥ 0}. (5.2.7)


The equality in (5.2.7) follows from Question 5.2.5. Then (C_V(V′), Y, 1, ω − ω′) is a
VOA with central charge c − c′. In the VOA language, this was developed in [202]; see
also the lucid treatment in section 3.11 of [376].
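
To illustrate the rule c − c′ numerically, the sketch below evaluates it for a diagonal coset of A_1 affine VOAs. Two assumptions are made that are not taken from this section: the Sugawara central charge (3.2.9c) is taken in its standard form c = k dim g/(k + h^∨), and the example chosen (levels k and 1 inside level k + 1) is the familiar GKO one. The function names are hypothetical.

```python
from fractions import Fraction

def affine_central_charge(k, dim_g, dual_coxeter):
    """Sugawara central charge c = k dim(g) / (k + h^vee).
    ASSUMPTION: this is the standard form of the formula cited as (3.2.9c)."""
    return Fraction(k * dim_g, k + dual_coxeter)

def a1_coset_central_charge(k):
    """c - c' for V = V(A_1, k) tensor V(A_1, 1), V' = diagonal V(A_1, k+1).
    Central charges add under tensor product; dim A_1 = 3 and h^vee = 2."""
    c = lambda level: affine_central_charge(level, dim_g=3, dual_coxeter=2)
    return c(k) + c(1) - c(k + 1)

print(a1_coset_central_charge(1))  # 1/2
print(a1_coset_central_charge(2))  # 7/10
```

The values 1/2 and 7/10 are the first two central charges 1 − 6/((k+2)(k+3)) of the unitary Virasoro minimal series.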

A conjecture of Moore and Seiberg [436], [437] states that every RCFT arises from 
orbifold and coset constructions applied to lattice and affine algebra theories (generously 
enough interpreted). They speculate that this would be the analogue here of Tannaka— 
Krein duality (Section 1.6.2). We seem a long way from proving this optimistic guess, 
even in a more limited context of sufficiently nice VOAs. 

The most famous VOA is the Moonshine module V^♮, constructed in 1984 in a tour
de force by Frenkel–Lepowsky–Meurman [200]. It has central charge c = 24, with
V^♮ = V^♮_0 ⊕ V^♮_1 ⊕ V^♮_2 ⊕ ···, where V^♮_0 = ℂ1 is one-dimensional, V^♮_1 = {0} is trivial and
V^♮_2 = (ℂω) ⊕ (Griess algebra) is (1 + 196883)-dimensional. Its automorphism group is
precisely the Monster M. Thus each graded piece V^♮_n is a finite-dimensional M-module.
It has graded dimension J, and is the space (0.3.1) lying in the heart of Conway and
Norton’s Monstrous Moonshine (see Sections 4.3.4 and 7.2.1).
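
The dimensions quoted here can be checked against the q-expansion of J = j − 744. The sketch below assumes the classical identity j = E_4^3/Δ, with E_4 = 1 + 240 Σ σ_3(n) q^n and Δ = q Π (1 − q^n)^24; that identity is standard but is not stated in this section, and the function name is illustrative only.

```python
def j_coefficients(n_terms):
    """Coefficients of q^{-1}, q^0, q^1, ... in j = E_4^3 / Delta.
    ASSUMPTION: uses the classical formula j = E_4^3 / Delta."""
    N = n_terms

    def mul(a, b):
        # product of two power series, truncated to N terms
        out = [0] * N
        for i, x in enumerate(a):
            if x:
                for j, y in enumerate(b[:N - i]):
                    out[i + j] += x * y
        return out

    sigma3 = lambda n: sum(d ** 3 for d in range(1, n + 1) if n % d == 0)
    e4 = [1] + [240 * sigma3(n) for n in range(1, N)]
    e4_cubed = mul(e4, mul(e4, e4))
    # 1 / prod_{n>=1} (1 - q^n)^24: multiply by 1/(1 - q^n), 24 times for each n
    inv = [1] + [0] * (N - 1)
    for n in range(1, N):
        for _ in range(24):
            for m in range(n, N):
                inv[m] += inv[m - n]
    return mul(e4_cubed, inv)

c = j_coefficients(4)
print(c)             # [1, 744, 196884, 21493760]
print(c[1] - 744)    # 0, the dimension of the weight-1 space
```

The entry with index n in this list is the coefficient of q^{n−1}; after subtracting the constant 744 it is the dimension of V^♮_n, so index 2 gives 196884 = 1 + 196883.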

A formal parallel exists between integral lattices L and VOAs V [201], [248]. The 
dimension n of L corresponds to the central charge c of V. An even lattice corresponds 
to a VOA while an odd lattice corresponds to a vertex operator superalgebra. As we 
see in the next section, the determinant |L| relates to a measure of how many irreducible
modules the VOA has. The norm-√2 vectors in L correspond to the vectors in
V_1: indeed, the norm-√2 vectors in a lattice L are special because they generate a Coxeter
subgroup in Aut(L); the vectors in V_1 are special because they generate a continuous
subgroup (a Lie group) of Aut(V). In particular, the Leech lattice Λ and the Moonshine
module V^♮ play analogous roles (Section 7.2.1). Analogies of these kinds are always
useful in their easy role as squirrels. The battle-cry ‘Why invent when one can profitably 
copy?’ is heard not only in Hollywood. 


Question 5.2.1. Let V be a VOA, and let a finite group G act as automorphisms on
V, so each space V_n is a (finite-dimensional) G-module. Prove that for each n, V_n is a
G-submodule of V_{n+1}. (Hint: Consider the map L_{−1}.)


Question 5.2.2. In any VOA, define a map e^{v_(0)} : V → V for each v ∈ V_1, and show that
for v ≠ 0 it defines a nontrivial automorphism of V. Verify that the e^{v_(0)} generate a normal
subgroup of Aut(V), and hence that Aut(V) will be uncountable if V_1 ≠ 0.


Question 5.2.3. Suppose a VOA V has V_1 = 0. For u, v ∈ V_2, define u × v = u_(1) v. Verify
that V_2 is commutative with this product, with identity ω/2. Define a ℂ-valued bilinear
form on V_2 and discover how it is compatible with ×.


Question 5.2.4. Let V be a vertex algebra, and suppose u ∈ V satisfies (u_(0))^2 = 0. Prove
that ker u_(0) / im u_(0) is itself a vertex algebra.


Question 5.2.5. Prove that [Y(u, z_1), Y(v, z_2)] = 0 iff u_(n) v = 0 for all n ≥ 0.


Question 5.2.6. (a) Suppose both V, V′ are complex n-dimensional vector spaces
together with choices of nondegenerate inner products. Verify that the Heisenberg VOAs
V(V^(1), k) and V(V′^(1), k′) are isomorphic as VOAs, provided only that k, k′ are both
nonzero.

(b) Let ĝ = g^(1) be the nontwisted affine algebra associated with a simple finite-dimensional
Lie algebra g, and let k ≠ k′ be two complex numbers, both distinct from
the critical level −h^∨. When are the affine algebra VOAs V(ĝ, k) and V(ĝ, k′) isomorphic
as VOAs?

(c) Let L, L′ be two positive-definite lattices, all of whose inner products u · v are even
integers. When are the lattice VOAs V(L) and V(L′) isomorphic as VOAs?


Question 5.2.7. Find an even indefinite lattice L such that the near-VOA V(L) has finite-dimensional
homogeneous spaces V(L)_n for all n ∈ ℤ.


5.3 Representation theory: the algebraic meaning of Moonshine 


We know affine algebras have modules (namely the integrable ones) with interesting
characters. However they have many other modules that are far less interesting, even
if we restrict to highest weight ones with positive integer level. What general principle
distinguishes the interesting ones from the generic? Of the uncountably many level k ∈ ℕ
highest-weight X_r^(1)-modules, the integrable ones are precisely those that are unitary.
It is tempting then to guess that unitarity is the key principle. However, the reason to
doubt its fundamental role is that there are RCFTs (e.g. the Yang–Lee model with central
charge c = −22/5, see section 7.4.1 of [131]) whose graded dimensions obey all of the
properties the affine characters do, but whose modules are not unitary.

The key feature possessed by the integrable affine modules is that they are unexpectedly
small: the null vectors in the associated Verma module, all of which are quotiented
away, are maximally numerous. In other words, they are also modules of a sufficiently
nice (‘rational’) VOA. The appearance of an affine algebra here is not directly significant,
rather it is the appearance of that rational VOA. Modules of those VOAs may or may
not be unitary. VOAs serve as the unifying mathematics underlying the modules singled
out by Moonshine.²

The raison d’être of VOAs is their modules, and in Moonshine we are primarily
interested in their graded dimensions and characters. It is to this important topic (the
algebraic meaning of Moonshine) that we finally turn. See also [199], [376].


5.3.1 Fundamentals 


A module of a VOA V is a vector space on which V acts, in such a way that this action
preserves all possible structure. More precisely:


Definition 5.3.1 [199] Let V be a VOA. A weak V-module (M, Y_M) is an ℕ-graded
vector space M = ⊕_{n∈ℕ} M_[n], and a linear map Y_M : V → End M[[z^{±1}]], written
Y_M(u, z) = Σ_{n∈ℤ} u_(n) z^{−n−1}, such that for any u ∈ V_k, the mode u_(n) is a linear map
from M_[ℓ] into M_[k+ℓ−n−1], and


Y_M(1, z) = id_M, (5.3.1a)

z_0^{−1} δ((z_1 − z_2)/z_0) Y_M(u, z_1) Y_M(v, z_2) − z_0^{−1} δ((z_2 − z_1)/(−z_0)) Y_M(v, z_2) Y_M(u, z_1)
= z_2^{−1} δ((z_1 − z_0)/z_2) Y_M(Y(u, z_0)v, z_2), (5.3.1b)


where each mode u_(n) operates on M. The Y_M(u, z) are also called vertex operators. A
weak V-module (M, Y_M) is called a V-module if in addition it comes with a grading
M = ⊕_{α∈ℂ} M_α, with M_α = 0 for Re(α) sufficiently negative, obeying


M_α = {x ∈ M | L_0 x = αx} (5.3.1c)


(the eigenvalue α is again called the conformal weight of x ∈ M_α), and all homogeneous
spaces M_α are finite-dimensional.


We are interested in V-modules. For the VOAs of interest to us (see Definition 5.3.2), 
the conformal weights are always rational (hence the name). Definition 5.3.1 uses the 
Jacobi identity (5.1.7a) rather than the simpler locality vA4 because, although locality and 
the Jacobi identity are equivalent for VOAs, for modules the Jacobi identity is stronger 
(see chapter 4 of [376]). 


² Victor Kac expresses a related position by isolating locality as the key principle [329].

As before, the modes L_n = ω_(n+1) of the conformal vector ω ∈ V yield on M a representation
of the Virasoro algebra Vir, with the same central charge c as V. In analogy
with (5.1.10b), the graded dimension of a V-module M is defined to be


χ_M(τ) = tr_M q^{L_0 − c/24} = q^{−c/24} Σ_{α∈ℂ} dim M_α q^α. (5.3.2)


It is fundamental to the whole theory that these χ_M are modular, at least for ‘nice’ V and
M (see Theorem 5.3.8 below). The automorphism group of V acts on each homogeneous
space M_α (that is, each M_α carries a representation of Aut(V)), and so the q-coefficients
of χ_M(τ) are dimensions of Aut(V)-representations (famous examples being (0.2.1)).

It is straightforward [199], [376] to write down the definitions of V-module homomorphism,
direct sum of V-modules, submodule, irreducible module (no nontrivial submodule),
completely reducible module (i.e. M can be written as a direct sum of irreducible
V-modules), etc. Invariant bilinear forms can be defined for modules as in (5.1.11a), and
have analogous properties [199], [380].

The easiest example of a V-module, of course, is V itself, called the adjoint module. 
If V is irreducible as a V-module, it is called simple (see Definition 6.2.3). All VOAs of 
interest in this book are simple. An example of a nonsimple vertex algebra is the affine 
algebra vertex algebra at critical level k = −h^∨.

The notion of tensor product, called fusion M ⊠ N, for VOA modules is unexpectedly
subtle. For example, the infinite-dimensional adjoint module V should have trivial
fusions, just like the one-dimensional Lie algebra module ℂ has trivial tensor products.
See, for example, [298], [222], [382] for various approaches. Fusion products in a weakly
rational VOA can be decomposed into irreducible modules as usual:


M ⊠ N = ⊕_{P∈Φ(V)} N_{MN}^P P, (5.3.3)


where the multiplicities N_{MN}^P are called fusion coefficients. These numbers are most
easily defined (via Schur’s Lemma) as the dimension of the space of intertwiners [199]
(Definition 6.1.9). For semi-simple Lie algebras, the tensor product of modules defines
a symmetric monoidal category (Section 1.6.2); for nice VOAs, the fusion of modules
defines a braided monoidal category and the structure constants N_{MN}^P a fusion ring
(Section 6.2.2).
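
For a concrete feel for a decomposition of the shape (5.3.3), the sketch below lists the fusion products for the integrable A_1 affine VOA at level k. It assumes the standard A_1 fusion rule, which is not derived in this section: modules are labelled by a = 0, ..., k, and c occurs in a ⊠ b with coefficient 1 exactly when |a − b| ≤ c ≤ min(a + b, 2k − a − b) and c ≡ a + b (mod 2). The function name is hypothetical.

```python
def a1_fusion(a, b, k):
    """Labels c (each with fusion coefficient 1) occurring in a [x] b at level k.
    ASSUMPTION: the standard A_1 level-k fusion rule, quoted without proof."""
    if not (0 <= a <= k and 0 <= b <= k):
        raise ValueError("labels must lie between 0 and k")
    return list(range(abs(a - b), min(a + b, 2 * k - a - b) + 1, 2))

k = 2
for a in range(k + 1):
    print([a1_fusion(a, b, k) for b in range(k + 1)])
# [[0], [1], [2]]
# [[1], [0, 2], [1]]
# [[2], [1], [0]]
```

The first row shows the adjoint module (label 0) fusing trivially with everything, as the text says it should, and the table is symmetric, reflecting the commutativity of fusion.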


Definition 5.3.2 [574] A VOA V is called weakly rational if every V-module is com- 
pletely reducible, V has only a finite number of irreducible modules, and every irreducible 
weak V-module is a V-module. 


Let Φ(V) denote the set of irreducible V-modules. Most of our VOAs will be weakly
rational. The term ‘weakly rational’ is not standard; rational is sometimes used. However,
a rational VOA should enjoy all properties of the chiral algebra of a RCFT, which is why
we reserve the term ‘rational’ for the stronger notion presented in Definition 6.2.3.


Lemma 5.3.3 [574] Let V be a weakly rational VOA, and let M be any irreducible
V-module. Then there is a number h ∈ ℚ such that the homogeneous subspace M_h is
nonzero, and such that if M_α ≠ 0 for some α ∈ ℂ, then α − h ∈ ℕ.

The proof isn’t difficult; see page 244 of [574] for a more general argument. We call
h = h(M) the conformal weight of M, and the space M_h = M_[0] the lowest-weight space
of M. For example, the conformal weight h(V) of the adjoint module is 0. The lowest-weight
space M_h generates the whole module, in the sense (5.1.9a) that M is spanned
by vectors of the form (u_1)_(n_1) ··· (u_k)_(n_k) y for u_i ∈ V and y ∈ M_h. The lemma implies
that for such a module M, we have χ_M(τ + 1) = e^{2πi(h − c/24)} χ_M(τ) as formal power series.

In both finite group theory and Lie theory, given any module M, a module structure 
can also be found on the vector space dual M* of M in a straightforward way. This 
module is called the dual or contragredient of M. Something similar happens for VOAs. 
However, the naive dual of an infinite-dimensional space tends to be too large (recall 
that in infinite dimensions, the double-dual (V*)* properly contains V), so here we take 
instead the restricted dual M* of M, defined by 


M* = ⊕_α (M_α)*. (5.3.4a)


The explicit V-module structure on M* (see section 5.2 of [199]) is quite complicated
and closely related to the definition of invariant bilinear form in (5.1.11a). Note that


χ_{M*}(τ) = χ_M(τ) (5.3.4b)


even though M* and M are usually non-isomorphic as V-modules. Thus our graded 
dimensions (5.3.2) won’t always distinguish modules, something that was independently 
observed in the context of Monstrous Moonshine, as we’ll see. We return to this bother- 
some but not unexpected fact in Section 5.3.3. The more obscure term ‘contragredient’ 
is usually used for M*, as ‘dual’ has too many unfortunately independent meanings. 
The notion of contragredient module plays a large role in RCFT: roughly, M* is the 
anti-particle of M, and they are related by charge-conjugation C. 

All VOAs V of interest to us have an anti-linear involution u ↦ u* such that the
invariant bilinear form (u|v) of (5.1.11a) satisfies


(ulv*) = Ou, Wve. (5.3.5a) 


The notion of unitary module M is important in physics: it is a V-module in which the 
bilinear form on M satisfies 


(ux | y)_M = (x | u*y)_M, ∀u ∈ V, x, y ∈ M. (5.3.5b)


Consider first the lattice VOA V(L) constructed in Section 5.2.2, where L is a positive-definite
even lattice (recall the definitions in Section 1.2.1). It is weakly rational, and its
irreducible modules are parametrised naturally by the cosets L*/L, where L* ⊇ L is the
dual lattice to L [144]. The explicit construction of these modules M[t], for [t] ∈ L*/L,
is very similar to that of the VOA V(L) itself (see section 6.5 of [376]). Thus the number
‖Φ(V(L))‖ of its irreducible modules is given by the determinant |L| of the lattice.
The adjoint module is M[0]. The module M[t] has contragredient M[−t] and graded
dimension


χ_{M[t]}(τ) = Θ_{t+L}(τ) / η(τ)^n, (5.3.6)


where n is the dimension of L, η is the Dedekind eta function (2.2.6b) and Θ_{t+L} is the
theta series of (2.2.11a). The fusion product here is M[t] ⊠ M[t′] = M[t + t′].
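
As a worked instance of (5.3.6), take the one-dimensional lattice L = √2 ℤ, for which |L| = 2 and L*/L has the two classes of t = 0 and t = 1/√2. The sketch below expands Θ_{t+L}/η, assuming the usual conventions Θ_{t+L}(τ) = Σ_{v∈t+L} q^{v·v/2} and η(τ) = q^{1/24} Π (1 − q^n); the choice of lattice and the function names are illustrative and not taken from the text.

```python
def partition_numbers(n_terms):
    """Coefficients of prod_{n>=1} 1/(1 - q^n), truncated to n_terms terms."""
    p = [1] + [0] * (n_terms - 1)
    for part in range(1, n_terms):
        for n in range(part, n_terms):
            p[n] += p[n - part]
    return p

def a1_lattice_character(shift, n_terms):
    """Coefficients of Theta_{t+L}/eta for L = sqrt(2) Z and t = shift/sqrt(2),
    shift in {0, 1}, with the overall factor q^{shift/4 - 1/24} removed."""
    theta = [0] * n_terms
    for n in range(-n_terms, n_terms + 1):
        exponent = n * n + shift * n   # (n + shift/2)^2 - (shift/2)^2
        if 0 <= exponent < n_terms:
            theta[exponent] += 1
    p = partition_numbers(n_terms)
    return [sum(theta[i] * p[m - i] for i in range(m + 1)) for m in range(n_terms)]

print(a1_lattice_character(0, 6))  # [1, 3, 4, 7, 13, 19]
print(a1_lattice_character(1, 5))  # [2, 2, 6, 8, 14]
```

The leading powers q^{−1/24} and q^{1/4 − 1/24} that were stripped off correspond to central charge c = 1 and conformal weights h = 0 and h = 1/4; the fusion rule M[t] ⊠ M[t′] = M[t + t′] becomes addition in L*/L ≅ ℤ/2.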

The Heisenberg VOAs are not weakly rational. For example, V(ℂ) has a distinct
irreducible module M(λ) (namely the Verma module V(λ) of (3.2.12b)) for every λ ∈ ℂ.
The adjoint module is M(0), and the contragredient of M(λ) is M(−λ). Only the modules
with λ ∈ ℝ are unitary. The graded dimension of M(λ) is given in (3.2.12c).

However, if g is a simple Lie algebra and ĝ = g^(1) is the associated nontwisted affine
algebra, then the VOA V(ĝ, k) will be weakly rational iff the level k lies in ℕ. Just as the
VOA V(ĝ, k) is the ĝ-module L(kω_0) with additional structure, the irreducible V(ĝ, k)-modules
can be identified with the ĝ-modules L(λ), for level-k integrable highest weights
λ ∈ P_+^k(g) [202]. In particular, the VOA graded dimension will equal the corresponding
specialised affine algebra characters x,(27itlo) = x,(t, 0, 0) of (3.2.11c). The usual 
tensor product L(λ) ⊗ L(μ) of affine algebra modules is less interesting than the fusion
product L(λ) ⊠ L(μ): in the former, levels add and the tensor product coefficients T_{λμ}^ν
can be infinite, while the latter is studied in Section 6.2.1.

A weakly rational VOA is called holomorphic if it has a unique irreducible module. 
As usual this terminology comes from RCFT: a holomorphic VOA can be the left- 
moving chiral algebra of a CFT with trivial right-moving chiral algebra, so the physical 
correlation functions (4.3.1a) of such a CFT would be holomorphic (at least locally, 
when all insertion points z; are distinct). Thus the lattice VOA V(L) is holomorphic iff 
the lattice L is self-dual. The most famous example of a holomorphic VOA though is the 
Moonshine module V^♮ [145]. In fact, its holomorphicity is one of the keys to Monstrous
Moonshine (see Question 5.3.4). 


5.3.2 Zhu’s algebra 


In many ways a VOA resembles a Lie algebra, and this analogy has often been exploited 
to flesh out the theory of VOAs. However, the representation theory of the weakly rational 
VOAs resembles that of a finite group. 

Consider for concreteness the symmetric group G = S_3. Its representation theory is
captured by its group algebra ℂG (Section 1.1.3), that is the formal span of the elements
σ ∈ G = {(1), (12), (23), (13), (123), (132)}, where G acts by left multiplication.
The associative algebra ℂG is semi-simple, and so is a direct sum of matrix algebras:
here,


ℂG = M_{1×1} ⊕ M_{1×1} ⊕ M_{2×2}, (5.3.7a)


where the first summand M_{1×1} contains one copy of the trivial one-dimensional irreducible
representation ρ_1(σ) = 1, the second summand M_{1×1} contains one copy of
the ‘sign’ one-dimensional irreducible representation ρ_s(σ) = (−1)^σ, and the four-dimensional
algebra M_{2×2} contains a continuum of copies of the two-dimensional irreducible
representation ρ_2. More precisely, the three subspaces of the group algebra ℂG
specified by (5.3.7a) are


V_1 = ℂ{(1) + (12) + (23) + (13) + (123) + (132)} ≅ ρ_1, (5.3.7b)
V_s = ℂ{(1) − (12) − (23) − (13) + (123) + (132)} ≅ ρ_s, (5.3.7c)
V_2 = ℂ{(1) − (123), (1) − (132), (12) − (23), (12) − (13)} ≅ ρ_2 ⊕ ρ_2. (5.3.7d)


Incidentally, the different copies of the irreducible module ρ_2 in the subspace V_2 are
parametrised by the complex projective line P^1(ℂ) ≅ S^2: choosing a nonzero point x in


ℂ{(1) − (12) + (23) − (132), (23) − (13) + (123) − (132)}, (5.3.7e)


and hitting with arbitrary σ ∈ G, spans a copy V_2(x) of the two-dimensional module ρ_2,
and V_2(x) ∩ V_2(x′) = {0} unless x and x′ are complex multiples of each other, in which
case V_2(x) and V_2(x′) are equal as sets. On the other hand, choosing a generic element
of V_2 (respectively ℂG) will span all of V_2 (respectively ℂG).
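
These statements about (5.3.7e) can be verified by exact linear algebra in ℂG. The sketch below assumes that permutations are multiplied left to right (στ means: apply σ first, then τ); this convention is an assumption on our part, chosen because with it the span (5.3.7e) behaves as described. All function names are illustrative.

```python
from fractions import Fraction

def perm(*cycle):
    """Permutation of {0,1,2} from a cycle written on {1,2,3}, e.g. perm(1,2,3)."""
    p = [0, 1, 2]
    for i, a in enumerate(cycle):
        p[a - 1] = cycle[(i + 1) % len(cycle)] - 1
    return tuple(p)

E, A, B, C = perm(), perm(1, 2), perm(2, 3), perm(1, 3)
R, S = perm(1, 2, 3), perm(1, 3, 2)
GROUP = [E, A, B, C, R, S]

def mul(p, q):
    """Product pq, ASSUMING left-to-right composition (apply p first, then q)."""
    return tuple(q[p[i]] for i in range(3))

def act(sigma, x):
    """Left multiplication by sigma on x, a dict {group element: coefficient}."""
    return {mul(sigma, g): c for g, c in x.items()}

def rank(vectors):
    """Rank of a list of group-algebra elements, by exact Gaussian elimination."""
    rows = [[Fraction(v.get(g, 0)) for g in GROUP] for v in vectors]
    r = 0
    for col in range(len(GROUP)):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def orbit(x):
    """All translates sigma.x, sigma in G; their span is the module generated by x."""
    return [act(s, x) for s in GROUP]

x1 = {E: 1, A: -1, B: 1, S: -1}   # first spanning vector of (5.3.7e)
x2 = {B: 1, C: -1, R: 1, S: -1}   # second spanning vector of (5.3.7e)
generic = {E: 1, R: -1}           # an element of V_2 not lying in (5.3.7e)

print(rank(orbit(x1)), rank(orbit(x2)))  # 2 2
print(rank(orbit(x1) + orbit(x2)))       # 4
print(rank(orbit(generic)))              # 4
```

Each of x1 and x2 generates a two-dimensional copy of ρ_2, the two copies together fill the four-dimensional V_2 and so intersect trivially, and the element (1) − (123) already generates all of V_2.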

The representation theory of a finite group G is equivalent to that of the associative 
algebra CG. Likewise, for semi-simple Lie algebras g there is also an associative algebra, 
generated by g, which classifies all irreducible g-modules: the universal enveloping 
algebra U (g) (Section 1.5.3). However, it is infinite-dimensional, reflecting the fact that 
g has infinitely many inequivalent irreducible modules. 

Remarkably, weakly rational VOAs V have (like finite G) a finite-dimensional associative
semi-simple algebra, denoted A(V), which classifies the finitely many irreducible
V-modules. As we know, the full module M can be generated from its lowest-weight
space M_h, by repeatedly acting by modes of V, and so it suffices to study M_h. Now, the
zero-modes o(u), defined at the beginning of Section 5.2.1, act on each homogeneous
space M_α; Zhu’s algebra A(V) is the algebra of zero-modes, as seen by the lowest-weight
spaces M_h. A more formal construction, which will begin next paragraph, is due to Zhu
[574], although it was anticipated in physics [429], [87]. Similar to the above, each
irreducible V-module M corresponds to a linear functional f_M on V (Section 4.4.4); a
certain large subspace O(V) of V lies in the kernel of all functionals f_M ∘ o(v) ∀v ∈ V,
so each of these defines a well-defined functional on the quotient A(V) := V/O(V). The
quotient A(V) has a product u ∗ v making it into an associative algebra; the space of
functionals f_M ∘ o(v) carries a module action of A(V), and as such can be identified with
the dual M_h* of the lowest-weight space of M. Conversely, any (irreducible) right-module
for A(V) is the lowest-weight space of an (irreducible) V-module M. This physically
motivated treatment of Zhu’s algebra is fleshed out in [227].

Zhu’s treatment is similar. For u, v ∈ V, where u ∈ V_k, define a product


u ∗ v = Res_z ( Y(u, z) ((z + 1)^k / z) v ), (5.3.8a)


or equivalently, in terms of the modes,


(u ∗ v)_(n) = Σ_{m≥k} u_(−1−m) ∘ v_(m+n) + Σ_{m≤k−1} v_(m+n) ∘ u_(−1−m). (5.3.8b)


Extend ∗ linearly to all u ∈ V. Let O(V) be the subspace of V spanned by elements


(L_{−1}u + L_0 u) ∗ v, ∀u, v ∈ V. (5.3.8c)


By Zhu’s algebra A(V) we mean the quotient V/O(V).
The point of these definitions is that, on the lowest-weight space M_h of any irreducible
V-module M, a straightforward calculation (see page 250 of [574]) verifies that


o(u ∗ v) = o(u) ∘ o(v). (5.3.9a)


Using (5.1.8b), (5.1.7d) and vA2, we see that


o(L_{−1}u + L_0 u) = 0 (5.3.9b)


identically on V. Together, (5.3.9) tell us o(u) = 0 on each lowest-weight space M_h,
for any u ∈ O(V). Thus for any class [u] ∈ A(V), the zero-mode o(u) is a well-defined
operator on each M_h.


Theorem 5.3.4 [574] Let V be a weakly rational VOA (recall Definition 5.3.2) and let
A(V) = V/O(V) be Zhu’s algebra. Then A(V) is a finite-dimensional, associative and
semi-simple algebra, isomorphic as an algebra to the matrix algebra


A(V) ≅ ⊕_{M∈Φ(V)} M_{n(M)×n(M)},


where Φ(V) is the set of all irreducible V-modules, and n(M) is the dimension of the
lowest-weight space M_h.


In other words, there is a one-to-one correspondence between the irreducible modules 
of A(V) and V; the irreducible A(V)-modules can in fact be naturally identified with 
the lowest-weight spaces M_h of the irreducible V-modules. It is almost identical to
what happens with the group algebra of a finite group. Note that the dimension n(M) is
the coefficient of the first nontrivial term n(M) q^{h − c/24} of the graded dimension χ_M.
The hard part of the proof of Theorem 5.3.4 is establishing that an irreducible A(V)- 
module lifts to an irreducible V-module (the basic idea is sketched above). Incidentally, 
there are non-weakly rational VOAs (coming from ‘logarithmic’ CFTs) with Zhu’s 
algebra A(V) finite-dimensional but not semi-simple. 

For example, Zhu’s algebra A(V^♮) for the Moonshine module V^♮ is one-dimensional,
while the integrable affine VOA V(ĝ, k) at level k ∈ ℕ has Zhu’s algebra


A(V(ĝ, k)) ≅ ⊕_{λ∈P_+^k(g)} M_{dim L(λ̄) × dim L(λ̄)},


where L(λ̄) is a highest-weight g-module (to get λ̄, drop λ_0 from λ). In general though,
it is hard to compute A(V) (unless the V-modules are already known!) because we lose
the grading: expressions like L_{−1}u + L_0 u are not homogeneous.
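
When the modules are known, the dimension of Zhu’s algebra follows at once from Theorem 5.3.4 as the sum of the squares n(M)^2. The sketch below does this for the A_1 affine VOA at level k, using the standard facts (assumed here, not proved in this section) that the level-k integrable highest weights have finite part λ̄ = 0, 1, ..., k and that dim L(λ̄) = λ̄ + 1. The function name is hypothetical.

```python
def zhu_dim_a1(k):
    """dim A(V) for the integrable A_1 affine VOA at level k, as the sum of
    (dim L(lambda-bar))^2 over lambda-bar = 0, ..., k.
    ASSUMPTION: dim L(lambda-bar) = lambda-bar + 1 for A_1."""
    return sum((a + 1) ** 2 for a in range(k + 1))

print([zhu_dim_a1(k) for k in range(1, 5)])  # [5, 14, 30, 55]
```

At level 1, for instance, A(V) ≅ M_{1×1} ⊕ M_{2×2}, of dimension 1 + 4 = 5.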

The definition (5.3.8a) of the product ‘∗’ in Zhu’s algebra can be modified to give the
more familiar ‘normal-ordered product’ (recall (5.1.6))


u · v = Res_z ( Y(u, z) v z^{−1} ) = u_(−1) v (5.3.10a)


for u ∈ V_k, or equivalently in terms of modes


(u · v)_(n) = Σ_{m≥0} u_(−1−m) ∘ v_(m+n) + Σ_{m≤−1} v_(m+n) ∘ u_(−1−m). (5.3.10b)


Let O_2(V) be the span of all elements of the form u_(−2) v, and A_2(V) the quotient V/O_2(V).
Then A_2(V) is a graded commutative associative algebra with product ‘·’. It also has a Lie
algebra structure, with bracket given by [u, v] = u_(0) v; together, the Lie and associative
products define a commutative Poisson algebra. Its main role in VOA theory is in a
finiteness condition:


Definition 5.3.5 [574] A VOA V is said to be C_2-cofinite if A_2(V) = V/O_2(V) is
finite-dimensional.


Most of the important weakly rational VOAs (e.g. the Moonshine module, the lattice
VOAs, the affine algebra VOAs at positive integer level) satisfy this condition. The term
‘C_2-cofinite’ comes from Zhu’s name for what we call O_2(V). It has several consequences.
Most importantly, the graded dimensions χ_M(τ) of a C_2-cofinite VOA converge
to functions holomorphic in the upper half-plane ℍ (theorem 4.4.2 of [574]). A
C_2-cofinite VOA will have well-defined finite fusion coefficients (5.3.3) (see theorem 13
in [229]).

It is conjectured that a VOA is weakly rational if and only if it is C_2-cofinite, but
although this would significantly simplify the definition of weakly rational, it seems
difficult to prove. Weakly rational VOAs satisfy dim A_2(V) ≥ dim A(V) (generalised
in lemma 3 of [229]), but inequality can occur: for example, the integrable affine
algebra VOA V(E_8^(1), 1) has a one-dimensional Zhu’s algebra but A_2(V) is at least
249-dimensional [224].

A C_2-cofinite VOA is finitely generated in the sense that there will be finitely many
vectors u^1, ..., u^n ∈ V (namely, choose u^i to be the lifts to V of a basis of A_2(V)) such
that V is spanned by all vectors of the form


u^{i_1}_(−m_1) ··· u^{i_k}_(−m_k) 1 (5.3.11a)


where m_1 > ··· > m_k > 0 [229]. Something similar (but weaker) holds for V-modules.
Using this we quickly obtain a growth estimate: given any C_2-cofinite VOA V, there
is a constant C > 0 such that, for any irreducible V-module M, the dimension of the
homogeneous space M_α is bounded above by


dim M_α < C_M e^{C √(α − h)}, (5.3.11b)


for some constant C_M, where as always h = h(M) is the conformal weight of M. The
constant C depends only on dim A_2(V), while C_M is essentially dim M_h, adjusted slightly
to ensure (5.3.11b) also holds for small α.

Various interesting generalisations of Zhu’s algebras have appeared in the literature
[149], [150], [229], [410]. From our point of view, these algebras play a crucial technical
role in the statement and proof of the modularity of VOA characters.


5.3.3 The characters of VOAs


The next four subsections mark a climax for the book, as we discuss the modularity of 
the graded dimensions (5.1.10b), (5.3.2). We also explain why this was anticipated by 
physicists. But first let’s reflect on the notion of character. 

Calling the quantities $\chi_V(\tau)$ and $\chi_M(\tau)$ ‘characters’, as is common in the literature, is
a misnomer — they are merely graded dimensions. Defining characters for an algebraic 
object is as much art as science. The beautiful success of the character theory of semi- 
simple and Borcherds—Kac—Moody Lie algebras hides the nontrivial intuition that went 
into the original definitions. Presumably the starting point was that the characters of 
finite groups are given by the trace. Also, exponentiation associates a Lie group with 
a Lie algebra. Putting this together leads to the character of (1.5.9a). The characters of 
(Borcherds—)Kac—Moody algebras then follow by analogy. Unfortunately, the situation 
for VOAs isn’t nearly as clear. 

The main properties we may hope a character $\chi_M$ to obey are: it specialises to
dimension (or graded dimension); it distinguishes inequivalent modules; and it respects
direct sum and tensor product (fusion for us), in the sense that $\chi_{M \oplus N} = \chi_M + \chi_N$ and
$\chi_{M \boxtimes N} = \chi_M\, \chi_N$. We would also expect the VOA characters in the special case of the
integrable affine VOA $V(\mathfrak{g}, k)$ to equal the corresponding affine algebra characters $\chi_\lambda$ in
(3.2.9a) (recall that the $V(\mathfrak{g}, k)$-modules can be identified with the integrable $\mathfrak{g}$-modules).

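This wish-list is hopelessly optimistic for even the nicest VOAs. The graded dimensions
$\chi_{L(\lambda)}(\tau)$ for the integrable affine VOA $V(\mathfrak{g}, k)$ will not respect the fusion product:

$$\chi_{L(\lambda) \boxtimes L(\mu)}(\tau) \ne \chi_{L(\lambda) \otimes L(\mu)}(\tau) = \chi_{L(\lambda)}(\tau)\, \chi_{L(\mu)}(\tau) = \chi_\lambda(\tau)\, \chi_\mu(\tau),$$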


where $L(\lambda) \otimes L(\mu)$ denotes the tensor product of $\mathfrak{g}$-modules. On the other hand, fusion
respects the asymptotic dimensions: for all sufficiently nice VOAs $V$, the limit

$$D(M) = \lim_{\tau \to 0} \frac{\chi_M(\tau)}{\chi_V(\tau)}, \qquad (5.3.12)$$

called the quantum dimension of $M \in \Phi(V)$, satisfies $D(M \boxtimes N) = D(M)\, D(N)$.
‘Sufficiently nice’ here means any $C_2$-cofinite weakly rational VOA $V$ obeying the additional
very common property that of all irreducible $V$-modules $M \in \Phi(V)$, a unique one
realises the smallest conformal weight $\min_{M \in \Phi(V)} h(M)$ (in the most familiar examples
the unique minimal conformal weight belongs to the adjoint module $M = V$).

Recall from (5.3.4b) that the graded dimensions $\chi_M(\tau)$ of inequivalent $V$-modules can
be equal. A further example occurs whenever an even positive-definite lattice $L$ has an
automorphism $\alpha$; then any pair $M[t]$, $M[\alpha t]$ of $V(L)$-modules will have identical graded
dimension. However, such equalities need not always have an easy algebraic explanation:
for example, in Monstrous Moonshine two McKay–Thompson series (namely, $T_{27A}(\tau) =
T_{27B}(\tau)$, corresponding to unrelated elements of order 27) accidentally coincide for no
obvious reason. None of this is surprising, since dimensions certainly don’t uniquely
specify Lie algebra or finite group modules.
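
The multiplicativity of quantum dimensions under fusion can be checked numerically in a small example. The sketch below takes the integrable affine VOA $V(\mathfrak{sl}_2, k)$ and assumes the standard closed forms for that example, none of which are derived in the text above: modules labelled by $\lambda = 0, \ldots, k$, quantum dimension $\sin(\pi(\lambda+1)/(k+2)) / \sin(\pi/(k+2))$, and fusion given by the truncated Clebsch-Gordan rule with multiplicity one. Since $D$ is additive over direct sums, $D(M \boxtimes N)$ is computed as the sum of $D$ over the irreducible summands. All function names are illustrative.

```python
import math


def quantum_dim(lam, k):
    """Quantum dimension of the level-k sl_2 module with label lam (0 <= lam <= k)."""
    return math.sin(math.pi * (lam + 1) / (k + 2)) / math.sin(math.pi / (k + 2))


def fusion(lam, mu, k):
    """Labels nu occurring (each once) in the level-k fusion product of lam and mu."""
    top = min(lam + mu, 2 * k - lam - mu)
    return list(range(abs(lam - mu), top + 1, 2))


def check_multiplicativity(k, tol=1e-9):
    """True if D(lam) * D(mu) equals the sum of D(nu) over the fusion product, for all pairs."""
    for lam in range(k + 1):
        for mu in range(k + 1):
            lhs = quantum_dim(lam, k) * quantum_dim(mu, k)
            rhs = sum(quantum_dim(nu, k) for nu in fusion(lam, mu, k))
            if abs(lhs - rhs) > tol:
                return False
    return True
```

Calling `check_multiplicativity(k)` for small levels exercises every pair of modules; note that the quantum dimensions are generally not integers, unlike ordinary dimensions.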

We certainly would like VOA characters to distinguish inequivalent $V$-modules, and
in fact be linearly independent. How to do this is clear from the study of lattice theta
functions or affine algebra characters: in order to retain more information of the
homogeneous spaces $M_{h+n}$ than merely their dimensions, we must include more variables in
$\chi_M$:


Definition 5.3.6 The character of a $V$-module $M$ is the one-point function $\chi_M(\tau, v)$:

$$\chi_M(\tau, v) = \mathrm{tr}_M\, o(v)\, q^{L_0 - c/24} = q^{h - c/24} \sum_{n=0}^{\infty} \mathrm{tr}_{M_{h+n}}\, o(v)\, q^n. \qquad (5.3.13)$$

Here $h = h(M)$ is the conformal weight of $M$, and $o(v)$ is the zero-mode (Section 5.2.1) of
$v \in V$, which is an endomorphism on each homogeneous space $M_{h+n}$ (so its trace can
be computed by choosing bases and writing $o(v)$ as a matrix for each $n$). This function
$\chi_M$ arises naturally in CFT, as the one-point chiral block (Section 4.3.2) on the torus.
We explain shortly why it is associated with a torus; this is the source of its modularity.

Note that $\chi_M(\tau, \mathbf{1})$ equals the graded dimension $\chi_M(\tau)$. By definition, the dependence
of $\chi_M(\tau, v)$ on $v \in V$ is linear. Provided $V$ is $C_2$-cofinite, theorem 4.4.1 of [574] tells us
that, for each $v \in V$, $\chi_M(\tau, v)$ is holomorphic for $\tau \in \mathbb{H}$. This is proved by finding and
studying a differential equation satisfied by $\chi_M(\tau, v)$. The modularity of these functions is
established in Section 5.3.5.

When $V$ is weakly rational and $C_2$-cofinite, the one-point functions are linearly
independent and thus distinguish inequivalent $V$-modules. In fact, we see from the proof of
theorem 5.3.1 in [574] that if $V_A$ is any lift from Zhu’s algebra $A(V)$ to $V$, then the
one-point functions $\chi_M(\tau, v)$ will remain linearly independent even if $v$ is restricted to the
finite-dimensional subspace $V_A$. For example, the graded dimensions $\chi_M(\tau)$ and $\chi_{M^*}(\tau)$
are equal, but for $v \in V_n$ the one-point functions obey

$$\chi_{M^*}(\tau, v) = (-1)^n\, \chi_M(\tau, v).$$


Although one-point functions (5.3.13) don’t directly respect the fusion product (but
recall (5.3.12)), they deserve the title ‘character’ as they are the simplest linearly
independent extension of graded dimension. However, since they depend linearly and not
exponentially on $v$, how can we reconcile them with the Jacobi theta functions (2.3.7)
and the affine algebra characters (3.2.9a)? Mindlessly defining a function

$$\exp[2\pi i w]\, \mathrm{tr}_M \exp[2\pi i\, o(v)]\, q^{L_0 - c/24} \qquad (5.3.14)$$

for $v \in V$ and $w \in \mathbb{C}$ will lose modularity.
The key is to realise that, although the exponential $q = e^{2\pi i \tau}$ is topological in origin,
the exponential $e^{2\pi i z}$ in (2.3.7) and (3.2.9a) is Lie theoretic in origin. In particular:

Definition 5.3.7 Let $V$ be a weakly rational $C_2$-cofinite VOA. For any $V$-module $M \in
\Phi(V)$, define the Jacobi character to be the quantity $\chi^J_M(\tau, v, w)$ given by (5.3.14), except
we restrict $v$ to the Lie algebra $V_1$.


Of course $v = 0$ and $w = 0$ recovers the graded dimensions. As we know, $e^{o(v)}$ is an
automorphism of $M$ for $v \in V_1$, and as we recall from the McKay–Thompson series
the graded trace of automorphisms is worthy of study. Question 5.3.5 asks the reader to
verify that $\chi^J_M$ recovers affine algebra characters. Of course the complex variable ‘$w$’ is
merely included for book-keeping. We return to Jacobi characters in Theorem 5.3.9.

If we hadn’t restricted $v$ in Definition 5.3.7 to $V_1$, then linear independence would
have been assured by that of the one-point functions $\chi_M(\tau, v)$ (why?). In the familiar
examples (e.g. lattice or affine algebra VOAs) we still have linear independence of the
Jacobi characters, but it won’t hold for all other VOAs.


5.3.4 Braided #5: the physics of modularity 


Let’s turn next to one of the central questions in the book: why should the VOA characters
$\chi_M$ have anything to do with modularity? In short, it is because they are toroidal chiral
blocks of RCFT, and the mapping class group $\Gamma_{1,1}$ (which must act on those chiral
blocks) is $\mathrm{SL}_2(\mathbb{Z})$. While filling in this explanation we’ll finally explain the shift ‘$c/24$’
appearing in the definition of the affine algebra characters and more generally the VOA
characters $\chi_M$.

Lurking in the background of the following argument is the closed string, with period-1
arc-parameter $\sigma$ and time-parameter $t$ (recall Section 4.3.2). For the left-moving
(holomorphic) sector it is convenient to introduce complex parameters $\sigma - it$ and $e^{2\pi i(\sigma - it)}$,
which we now call $z$ and $w$, respectively.

From the perspective of VOAs and CFT, the easiest way to realise the torus $\mathbb{C}/(\mathbb{Z} + \mathbb{Z}\tau)$
for $\tau \in \mathbb{H}$, starting with the space $\mathbb{C}$, is by first considering the map $z \mapsto e^{2\pi i z}$ (the ‘$2\pi i$’ is
merely a convenient normalisation). This is a holomorphic map sending neighbourhoods
of 0 to neighbourhoods of 1. It changes the global topology, however, sending the plane
$\mathbb{C}$ to the annulus $\mathbb{C} \setminus \{0\}$. Now it is simple to obtain our torus: we simply identify $z$ and $qz$,
where as always $q = e^{2\pi i \tau}$. This is equivalent to taking the finite annulus $\{z \in \mathbb{C} \mid |q| <
|z| < 1\}$ and sewing together its two boundary circles by identifying $z$ on the outer
circle with $qz$ on the inner. The resulting torus is conformally equivalent to $\mathbb{C}/(\mathbb{Z} + \mathbb{Z}\tau)$
(why?). The point is that the chiral blocks on the torus can be obtained from those of the
plane, through this construction of the torus from $\mathbb{C}$. Let us now give the details.

Let $V$ be any VOA. For any coordinate transformation $z \mapsto w = f(z)$ sending 0 to
0, and holomorphic in a neighbourhood of 0, the Virasoro algebra lets us calculate its
effect on any vertex operator: we can write

$$Y(v, z) \mapsto T_f \circ Y(v, z) \circ T_f^{-1} \qquad (5.3.15a)$$

for some invertible linear map $T_f : V \to V$ (see [223], [295] for the explicit and general
calculation). More precisely, there are $a_n \in \mathbb{C}$ such that (see proposition 2.1.1 in [295])

$$f(z) = \exp\left[\sum_{n=0}^{\infty} a_n z^{n+1} \frac{d}{dz}\right] z \qquad (5.3.15b)$$

as formal power series, where ‘exp’ is defined by its Taylor series. Then we obtain

$$T_f\, v = \exp\left[\sum_{n=0}^{\infty} a_n L_n\right] v \qquad (5.3.15c)$$

(regularity VA5 implies this map $T_f : V \to V$ is always defined). When $v$ is a conformal
primary of conformal weight $k$ (recall (5.2.3)), the transformation is particularly nice:

$$T_f \circ Y(v, z) \circ T_f^{-1} = Y(v, w)\, (f'(z))^k. \qquad (5.3.15d)$$

The other important special case is the stress–energy tensor $T(z) = Y(\omega, z)$:

$$T_f \circ Y(\omega, z) \circ T_f^{-1} = Y(\omega, w)\, (f'(z))^2 + \frac{c}{12} \{f, z\}, \qquad (5.3.15e)$$

where $\{f, z\}$ is the Schwarzian derivative

$$\{f, z\} := \frac{f'''(z)}{f'(z)} - \frac{3}{2} \left(\frac{f''(z)}{f'(z)}\right)^2. \qquad (5.3.15f)$$

The factor ‘$c/12$’ in (5.3.15e) is the same as in (3.1.5a). The Schwarzian derivative
vanishes if and only if $f$ is a Möbius transformation (i.e. if and only if $f$ conformally
maps the Riemann sphere to itself), and so is a measure of how $f$ changes the global
topology.

Provided $f(z)$ is holomorphic near 0 and obeys $f(0) = 0$, a second VOA structure
can be defined on the vector space $V$ as follows. The vertex operators are $Y_f(v, z) =
Y(T_f v, f(z))$, the vacuum is $\mathbf{1}_f = T_f(\mathbf{1}) = \mathbf{1}$, and the conformal vector is $\omega_f = T_f(\omega)$.
Let $V_f$ denote this second VOA. Then $V$ and $V_f$ are isomorphic. (See [293] for a
generalisation dropping the $f(0) = 0$ condition.)

We are interested in the transformation $w = f(z) = e^{2\pi i z} - 1$. Then everything
simplifies and we get

$$\omega_f = -4\pi^2\, (\omega - c/24), \qquad (5.3.16a)$$

$$Y_f(v, z) = Y(e^{2\pi i z L_0} v,\, e^{2\pi i z} - 1). \qquad (5.3.16b)$$

Although $V_f$ is a VOA isomorphic to $V$ sharing the same underlying space, modes and
conformal weights are quite different. We will use square brackets to indicate the modes
of $V_f$, and denote its Virasoro generators by $L[n] = (\omega_f)_{[n+1]}$. We find for instance that

$$L[-1] = 2\pi i\, (L_{-1} + L_0), \qquad (5.3.16c)$$

$$L[0] = L_0 + \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n(n+1)}\, L_n. \qquad (5.3.16d)$$

Although by the isomorphism of $V$ and $V_f$ the homogeneous spaces $V_n$ and $V_{[n]}$ must be
of equal dimension, and in fact carry isomorphic representations of $\mathrm{Aut}\, V$, we only have
$V_n = V_{[n]}$ for $n = 0$ or if $\dim V_n = 0$. On the other hand, if $v \in V$ is a conformal primary
of conformal weight $k$ with respect to the operators $L_n$, then it will be one with respect
to the operators $L[n]$ as well (see Question 5.3.2).

For a technical reason, we are also interested in the simple relation between the usual
power series modes $L[n]$ of $V_f$, and the Fourier modes $L'_n$ of $V$, defined by


[o0] 
T(z) = Y(o, z) = — a Lg, 


m=— 00 
We get (recall Question 3.1.8)

$$L'_n = L[n] - \delta_{n,0}\, \frac{c}{24}.$$

The occurrences of ‘$-c/24$’ in, for example, the characters of affine algebras and
VOAs can be traced back to its occurrence in (5.3.16a). Mathematically, it is a symptom
of the change of global topology, from the plane to an annulus. Physically this is
interpreted as the Casimir energy of the cylinder [3]; see also the discussion in section 5.4
of [131].

Our map $f$ mapped the plane to the annulus $\mathbb{C} \setminus \{-1\}$. To get the torus, we need to
identify $z$ on the outer circle $e^{i\theta} - 1$, with the point $q(z + 1) - 1 = q e^{i\theta} - 1$ on the inner
circle. By the axioms of CFT (e.g. Section 4.4.1), this identification (‘sewing’)
corresponds to taking a trace. For simplicity consider first the vacuum-to-vacuum amplitude
(‘partition function’) on this torus, and write $\tau = s + it$. The desired trace will be over
the full space of states $\mathcal{H}$, and will be of the ‘propagator’ for the cylinder, which takes
the string and evolves it $2\pi t$ ahead in time and twists it $2\pi s$ arcwise. The infinitesimal
generator of twists is the corresponding momentum operator, call it $P$, and the
infinitesimal generator for time evolution is the Hamiltonian $H$, both in the $z$-coordinate frame.
Thus the partition function will be

$$Z(\tau) = \mathrm{tr}_{\mathcal{H}} \exp[2\pi i s P - 2\pi t H].$$

To find, for example, the Hamiltonian, note that changing time by $\delta t$ changes the
$w$-coordinate by the factor $e^{2\pi\, \delta t}$, so the Hamiltonian generates dilations in $w$ (recall the
calculation in Section 4.3.2); similarly, the momentum operator generates rotations in
$w$. We obtain


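$$P = L'_0 - \overline{L}'_0 = L[0] - \overline{L}[0] - \frac{c}{24} + \frac{\overline{c}}{24},$$

$$H = L'_0 + \overline{L}'_0 = L[0] + \overline{L}[0] - \frac{c}{24} - \frac{\overline{c}}{24},$$

where we use bars to denote the anti-holomorphic quantities. Thus we obtain the familiar
expression for the partition function:

$$Z(\tau) = \mathrm{tr}_{\mathcal{H}}\, q^{L[0] - c/24}\, \overline{q}^{\,\overline{L}[0] - \overline{c}/24} = \mathrm{tr}_{\mathcal{H}}\, q^{L_0 - c/24}\, \overline{q}^{\,\overline{L}_0 - \overline{c}/24},$$

where the final equality follows from the isomorphism of VOAs $V$ and $V_f$. CFT or
string theory requires that $Z(\tau)$ be a function only of the conformal equivalence class of
the torus $\mathbb{C}/(\mathbb{Z} + \mathbb{Z}\tau)$; in other words, $Z(\tau)$ must be invariant under the action of the
modular group $\mathrm{SL}_2(\mathbb{Z})$.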
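
The size of the shift $c/24$ can be checked from the Schwarzian derivative $f'''/f' - \frac{3}{2}(f''/f')^2$ in the transformation law of the stress-energy tensor. The following sketch, with illustrative helper names, evaluates it from the first three derivatives at a point. For $f(z) = e^{2\pi i z} - 1$ it gives the constant $2\pi^2$, so that $(c/12)\{f, z\} = 4\pi^2 \cdot c/24$, which is the shift in (5.3.16a) up to the overall normalisation. For a Möbius transformation it vanishes, as stated above.

```python
import cmath
import math


def schwarzian(d1, d2, d3):
    """Schwarzian derivative f'''/f' - (3/2) (f''/f')**2 from derivative values at one point."""
    return d3 / d1 - 1.5 * (d2 / d1) ** 2


def exp_map_derivs(z):
    """First three derivatives of f(z) = exp(2 pi i z) - 1 at the point z."""
    a = 2j * math.pi
    e = cmath.exp(a * z)
    return a * e, a ** 2 * e, a ** 3 * e


def mobius_derivs(a, b, c, d, z):
    """First three derivatives of f(z) = (a z + b) / (c z + d) at the point z."""
    det = a * d - b * c
    u = c * z + d
    return det / u ** 2, -2 * c * det / u ** 3, 6 * c ** 2 * det / u ** 4
```

For instance, `schwarzian(*exp_map_derivs(0.3 + 0.1j))` agrees with `2 * math.pi ** 2` to rounding error, independently of the chosen point.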

We are more interested here in the associated chiral quantities, since a VOA is the chiral
algebra of the theory. From the previous paragraph, together with the decomposition
(4.3.6) of $\mathcal{H}$ into modules of $V \otimes V'$, we can now read off the decomposition of $Z(\tau)$
into chiral blocks (see (4.3.8b)) in a RCFT. Hence the chiral blocks for the torus are

$$\mathrm{tr}_M\, q^{L_0 - c/24},$$

that is, they are simply the graded dimensions of the irreducible $V$-modules, including
the strange shift by $c/24$. RCFT requires that this space must carry a projective
representation of the mapping class group of the torus $\mathrm{SL}_2(\mathbb{Z})$.

By the same reasoning, we can calculate the $n$-point chiral blocks on the torus. For
$L[0]$-homogeneous vectors $u^i \in V_{[k_i]}$, they are simply

$$e^{2\pi i z_1 k_1} \cdots e^{2\pi i z_n k_n}\, \mathrm{tr}_M\, Y_M(u^1, e^{2\pi i z_1}) \cdots Y_M(u^n, e^{2\pi i z_n})\, q^{L_0 - c/24}, \qquad (5.3.17a)$$

where $u^i \in V$ are the inserted states and $z_i \in \mathbb{C}$ are the points of insertion. As usual,
the definition for nonhomogeneous vectors follows by linearity. By construction these
functions automatically have period 1 in each $z_i$, and it is an easy calculation to verify
that they also have period $\tau$ in each $z_i$, and thus the insertion points $z_i$ lie on the torus
$\mathbb{C}/(\mathbb{Z} + \mathbb{Z}\tau)$, as they should. In particular, the reader can verify that the one-point chiral
blocks are indeed what we call the one-point functions: for $u \in V_{[k]}$,

$$e^{2\pi i z k}\, \mathrm{tr}_M\, Y_M(u, e^{2\pi i z})\, q^{L_0 - c/24} = \chi_M(\tau, u), \qquad (5.3.17b)$$

hence the name of the latter. By the general principles of RCFT, the space of say one-point
chiral blocks should carry a projective representation of the mapping class group of the
once-punctured torus, i.e. $\mathrm{SL}_2(\mathbb{Z})$ (recall (4.3.9)), called modular data (Section 6.1.2).
In Section 7.2.4 we find that a much larger group acts naturally on these one-point
functions.

In (5.3.17) we inserted states $u^i$ from only the vacuum sector. More generally, however,
the states $u^i$ can come from any sector, that is, be vectors in any module $M \in \Phi(V)$. In
that case the vertex operators $Y_M$ should be replaced by intertwiners $\mathcal{Y}$ (Definition 6.1.9).
Although this generalisation is fundamental to VOAs and RCFT, it is less so for
Monstrous Moonshine (since $V^\natural$ is holomorphic).

The point of this subsection is to see in some detail how physics (RCFT) anticipates 
the statement and proof of Zhu’s Theorem, to which we now turn. 


5.3.5 The modularity of VOA characters 

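The most important property of the one-point functions is their modularity:

Theorem 5.3.8 (Zhu [574]) Suppose $V$ is a $C_2$-cofinite weakly rational VOA (see
Definitions 5.3.2 and 5.3.5), and let $\Phi(V)$ be the finite set of irreducible $V$-modules.
Then there is a representation $\rho$ of $\mathrm{SL}_2(\mathbb{Z})$ by complex matrices $\rho(A)$ indexed by
$V$-modules $M, N \in \Phi(V)$, such that the one-point functions (5.3.13) obey

$$\chi_M\left(\frac{a\tau + b}{c\tau + d}, v\right) = (c\tau + d)^n \sum_{N \in \Phi(V)} \rho\begin{pmatrix} a & b \\ c & d \end{pmatrix}_{MN} \chi_N(\tau, v) \qquad (5.3.18a)$$

for any $v \in V$ obeying $L[0]\, v = n v$ for some $n \in \mathbb{N}$ (see (5.3.16d)).

In particular, the graded dimensions (5.3.2) obey

$$\chi_M\left(\frac{a\tau + b}{c\tau + d}\right) = \sum_{N \in \Phi(V)} \rho\begin{pmatrix} a & b \\ c & d \end{pmatrix}_{MN} \chi_N(\tau), \qquad \forall \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z}). \qquad (5.3.18b)$$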
In (5.3.18), the quantity ‘$c$’ is an entry of a matrix in $\mathrm{SL}_2(\mathbb{Z})$ and should not be confused
with the central charge. As we saw last subsection, $L[0]$ plays the role of $L_0$ in a Virasoro
representation obtained from $L_n$ by a change-of-variables $z$: $V = \oplus_n V_{[n]}$, where $n \in \mathbb{N}$
and $V_{[n]}$ is the eigenspace of $L[0]$ with eigenvalue $n$. We can summarise (5.3.18a) by
saying that $\chi_M(\tau, v)$ is a vector-valued modular form of weight $n$ and multiplier $\rho$ (recall
Definition 2.2.2). We will summarise the proof of Theorem 5.3.8 shortly; see [442] for
an independent argument.
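
A numerical sanity check of (5.3.18b) is possible in the simplest nontrivial case. The sketch below is an illustration under assumptions that are not taken from the text above: it uses the $c = 1/2$ Virasoro (Ising) VOA, whose three irreducible modules have conformal weights $0$, $1/2$ and $1/16$, together with the standard infinite-product formulas for their graded dimensions and the standard matrix $S = \rho\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ for this example. The function names and the truncation parameter `terms` are illustrative.

```python
import cmath
import math


def ising_characters(tau, terms=60):
    """Graded dimensions (chi_0, chi_{1/2}, chi_{1/16}) of the c = 1/2 Virasoro VOA at tau."""
    def qpow(x):
        # q**x with q = exp(2 pi i tau), defined through tau to avoid branch ambiguity
        return cmath.exp(2j * math.pi * tau * x)

    plus = minus = sigma = 1.0 + 0j
    for n in range(1, terms + 1):
        plus *= 1 + qpow(n - 0.5)
        minus *= 1 - qpow(n - 0.5)
        sigma *= 1 + qpow(n)
    pre = qpow(-1 / 48)                     # q^{-c/24} with c = 1/2
    chi_0 = pre * (plus + minus) / 2
    chi_half = pre * (plus - minus) / 2
    chi_sixteenth = qpow(1 / 24) * sigma    # q^{1/16 - 1/48} = q^{1/24}
    return [chi_0, chi_half, chi_sixteenth]


R2 = math.sqrt(2)
S = [[0.5, 0.5, R2 / 2],
     [0.5, 0.5, -R2 / 2],
     [R2 / 2, -R2 / 2, 0.0]]


def s_transform_error(tau):
    """Largest deviation between chi_M(-1/tau) and sum_N S[M][N] chi_N(tau)."""
    lhs = ising_characters(-1 / tau)
    chi = ising_characters(tau)
    rhs = [sum(S[m][n] * chi[n] for n in range(3)) for m in range(3)]
    return max(abs(x - y) for x, y in zip(lhs, rhs))
```

Evaluating `s_transform_error` at a few points of the upper half-plane (with imaginary part neither very small nor very large, so that the truncated products are accurate) gives values at the level of rounding error.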

One-point functions for the Moonshine module $V = V^\natural$ are studied in [155], where we
find that all meromorphic modular forms for $\mathrm{SL}_2(\mathbb{Z})$ appear as some $\chi_{V^\natural}(\tau, v)$, provided
the obvious constraints (namely that they be holomorphic in $\mathbb{H}$, have zero constant term
in their $q$-expansion and have at worst a simple pole at $q = 0$) are satisfied; clearly,
if the coefficient of $q^n$ in $\chi_M(\tau)$ is zero then it must vanish in all other $\chi_M(\tau, v)$. Thus
although we see the Monster in the graded dimension of $V^\natural$, we won’t see it in most
one-point functions of $V^\natural$.

However, if $v \in V$ is fixed by some subgroup $G_v$ of the automorphism group of $V$, then
the $q^\alpha$ coefficient of $\chi_M(\tau, v)$ relates to the representations of $G_v$ and the eigenvalues of
$o(v)|_{M_\alpha}$ (see Question 5.3.3). Note that in each homogeneous space $V_n \ne 0$ there will be
nonzero vectors invariant under the full automorphism group of $V$ (why?). For example,
we read off from Table 7.3 that in the homogeneous spaces $(V^\natural)_n$ of the Moonshine
module for $0 \le n \le 8$, the $\mathbb{M}$-invariant subspace has dimension 1, 0, 1, 1, 2, 2, 4, 4, 7,
respectively.
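
The dimensions just quoted can be compared with a simple count. Every vector obtained from the vacuum by applying Virasoro modes $L_{-m}$ with $m \ge 2$ is fixed by all automorphisms, since the conformal vector is. Assuming (this is not stated in the text above) that these vectors are linearly independent, their number in degree $n$ is the number of partitions of $n$ into parts of size at least 2. The sketch below, with an illustrative function name, computes these counts; for $0 \le n \le 8$ they reproduce the list 1, 0, 1, 1, 2, 2, 4, 4, 7.

```python
def partitions_min_part(n_max, min_part=2):
    """counts[n] = number of partitions of n into parts >= min_part, for 0 <= n <= n_max."""
    counts = [1] + [0] * n_max
    # Standard coin-change recursion: allow one more part size at a time.
    for part in range(min_part, n_max + 1):
        for n in range(part, n_max + 1):
            counts[n] += counts[n - part]
    return counts
```

The agreement in this range says only that no further invariants are needed to account for the quoted dimensions; it makes no claim about larger $n$.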
The representation $\rho$ in Zhu’s Theorem is called modular data (Section 6.1.2). The
diagonal matrix $\rho\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ is given in (4.3.10). The matrix $S = \rho\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ relates to
the fusion multiplicities $\mathcal{N}_{MN}^{P}$ via Verlinde’s formula (6.1.1b) (at least for nice VOAs;
see Section 6.2.2). It is conjectured that, for sufficiently nice VOAs, the representation $\rho$
should be trivial on a congruence subgroup $\Gamma(N)$ (see the Congruence Property 6.1.7).
When this is true, each graded dimension $\chi_M(\tau)$ will be a modular function for that
$\Gamma(N)$.

If we weaken the hypothesis of weak rationality or C2-cofiniteness (recall that these 
are conjectured to be equivalent) in Zhu’s Theorem, then we can still recover some kind 
of modularity. In particular, physicists speak of quasi-rational CFTs, which are CFTs 
with finite fusions; in examples it seems that they still obey some weakened form of 
Zhu’s Theorem (see Section 6.2.2). 

Note that Zhu’s Theorem is already strong enough to imply that the Moonshine module
$V^\natural$ must have graded dimension $J(\tau)$. To see this, note that holomorphicity implies that
$\rho(A)$ is a one-dimensional representation of $\mathrm{SL}_2(\mathbb{Z})$. However, $\rho\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ must be trivial
(a one-dimensional representation of $\mathrm{SL}_2(\mathbb{Z})$ is determined by its value on this matrix)
and thus

$$\chi_{V^\natural}(A.\tau) = \chi_{V^\natural}(\tau), \qquad \forall A \in \mathrm{SL}_2(\mathbb{Z}).$$

We know $\chi_{V^\natural}(\tau)$ must be holomorphic in $\mathbb{H}$ (all graded dimensions are), has constant
term 0 and a simple pole at the cusp. Therefore it equals $J(\tau)$. See also Question 5.3.4.
The proof of the Hauptmodul property for the other McKay–Thompson series $T_g$ is much
more subtle, unfortunately.
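
To make this conclusion concrete, the first coefficients of $J(\tau) = j(\tau) - 744$ can be computed with exact integer arithmetic. The sketch below assumes the classical identity $j = E_4^3/\Delta$ with $E_4 = 1 + 240 \sum_{n \ge 1} \sigma_3(n) q^n$ and $\Delta = q \prod_{n \ge 1} (1 - q^n)^{24}$, which is standard but not derived in the text above; the function name is illustrative.

```python
def j_coefficients(n_terms):
    """Return [c(-1), c(0), c(1), ...] (n_terms entries) where j(tau) = sum_n c(n) q^n."""
    N = n_terms + 1
    # E_4 = 1 + 240 * sum_{n >= 1} sigma_3(n) q^n, truncated to N coefficients
    e4 = [1] + [240 * sum(d ** 3 for d in range(1, n + 1) if n % d == 0)
                for n in range(1, N)]

    def mul(a, b):
        # Product of two truncated power series
        out = [0] * N
        for i, x in enumerate(a):
            if x:
                for k, y in enumerate(b[: N - i]):
                    out[i + k] += x * y
        return out

    e4_cubed = mul(mul(e4, e4), e4)

    # Delta / q = prod_{n >= 1} (1 - q^n)^24, built one factor at a time
    prod = [1] + [0] * (N - 1)
    for n in range(1, N):
        for _ in range(24):
            for k in range(N - 1, n - 1, -1):
                prod[k] -= prod[k - n]

    # Power series division: quot * prod = e4_cubed (prod has constant term 1)
    quot = [0] * N
    for k in range(N):
        quot[k] = e4_cubed[k] - sum(quot[i] * prod[k - i] for i in range(k))
    return quot[:n_terms]
```

The entry at index 2 is the coefficient of $q$, namely 196884, the number in McKay's observation; subtracting 744 from the entry at index 1 gives the vanishing constant term of $J$.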

Zhu’s Theorem rigorously generalises RCFT modularity to that of any sufficiently 
nice VOA. Its proof is long and complicated, but follows closely the intuition of CFT. 

Zhu first defines abstractly a space of sequences $(S_1, S_2, \ldots)$ of functions, where
each $S_n$ maps $n$-tuples $(a_1, \ldots, a_n) \in V^n$ to meromorphic functions of $(z_1, \ldots, z_n, \tau) \in
\mathbb{C}^n \times \mathbb{H}$. They obey several conditions, for example they are doubly-periodic in each
variable $z_i$, with periods 1 and $\tau$. Each function $S_n$ is what we would call a chiral block, and his
definition abstracts out the manifest properties of this space. It is immediate from his
definition that $\mathrm{SL}_2(\mathbb{Z})$ acts on this space, in exactly the way we would expect from
CFT. Verlinde’s formula (6.1.2) tells us that the dimensions of these spaces should be
independent of the number $n$ of punctures, and in fact CFT tells us that a canonical basis
is given by

$$(a_1, \ldots, a_n) \mapsto \mathrm{tr}_M \left( Y_M(a_1, e^{2\pi i z_1}) \cdots Y_M(a_n, e^{2\pi i z_n})\, q^{L_0} \right) \qquad (5.3.19)$$

(appropriately normalised), for each irreducible $V$-module $M$. However, showing rigorously
that these functions (5.3.19) in fact satisfy his definition, and that they do indeed
span his space, are both more difficult. But we see that the modularity in Zhu’s Theorem
arises through that $\mathrm{SL}_2(\mathbb{Z})$ action on the space of chiral blocks.

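The modularity of the Jacobi characters $\chi^J_M(\tau, v, w)$ of Definition 5.3.7 is now easy.

Theorem 5.3.9 Let $V$ be a weakly rational $C_2$-cofinite VOA. Then the Jacobi characters
$\chi^J_M(\tau, v, w)$ are holomorphic in $\mathbb{H}$ for any fixed $v, w$, and obey

$$\chi^J_M\left(\frac{a\tau + b}{c\tau + d}, \frac{v}{c\tau + d}, w + \frac{c\,(v|v)}{2(c\tau + d)}\right) = \sum_{N \in \Phi(V)} \rho\begin{pmatrix} a & b \\ c & d \end{pmatrix}_{MN} \chi^J_N(\tau, v, w), \qquad (5.3.20)$$

for all $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$, $v \in V_1$, and $w \in \mathbb{C}$, where $\rho$ is as in Theorem 5.3.8 and
where the inner-product $(v|v)$ is given by $v_1 v = -(v|v)\mathbf{1}$.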


Again, ‘$c$’ in (5.3.20) refers to a matrix entry and not the central charge. The
transformation on the left side of (5.3.20) is exactly that of, for example, Jacobi theta functions.
Theorem 5.3.9 is an easy corollary of the main theorem of [426] (which in turn is a
corollary of the proof of Theorem 5.3.8 as given in [574]). In particular, define

$$Z_M(\tau, u, v) := \mathrm{tr}_M\, e^{2\pi i (o(v) - (u|v)/2)}\, q^{L_0 + o(u) - (c + 12 (u|u))/24} \qquad (5.3.21a)$$

for any $u, v \in V_1$, so $\chi^J_M(\tau, v, w) = \exp[2\pi i w]\, Z_M(\tau, 0, v)$. Then provided $o(v) u = 0$
(i.e. $u$ and $v$ commute in the Lie algebra $V_1$), [426] obtained the transformation law

$$Z_M\left(\frac{a\tau + b}{c\tau + d}, u, v\right) = \sum_{N \in \Phi(V)} \rho\begin{pmatrix} a & b \\ c & d \end{pmatrix}_{MN} Z_N(\tau, cv + du, av + bu), \qquad (5.3.21b)$$

for any $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$. To prove (5.3.20), it suffices to prove it for the two generators
$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, and this follows directly from (5.3.21b). Holomorphicity of
$Z_M$ follows from Proposition 1.8 of [151].


5.3.6 Twisted #5: twisted modules and orbifolds 


Last subsection we saw how the modularity of VOA characters permits a one-paragraph
proof that the graded dimension of the Moonshine module $V^\natural$ must equal $J(\tau)$. How
about the other McKay–Thompson series? In this subsection we find that the notion of
$V$-module must be generalised to the equally fundamental notion of twisted $V$-modules.
Twisted modules are vaguely reminiscent of projective representations of groups, but
while a projective representation of $G$ is a true representation of some central extension
of $G$, a twisted $V$-module is a true module of a vertex operator subalgebra of $V$. Most
groups don’t have twisted modules, and VOAs don’t seem to have a natural notion of a
projective module, but Lie algebras have a foot in each camp and as we see in Chapter 3
have both kinds of modules.

Far from being an esoteric development, twisted modules are crucial to Monstrous 
Moonshine and absolutely central to the whole theory. In CFT and string theory, they 
arise in the important orbifold construction (Section 4.3.4). Twisted modules of Lie alge- 
bras — a baby example of twisted modules of VOAs — are discussed in Sections 1.5.4 
and 3.4.1. Moonshine is the relation of VOAs to modular functions; the modular func- 
tion analogue of this twisting has long been understood and also plays a central role 
(Section 2.3.3). 

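Fix a VOA $V$ and any automorphism $g \in \mathrm{Aut}(V)$ of order $N$. We can define
$g$-twisted modules [185], by blending together the definitions in Sections 3.4.1 and 5.3.1.
In particular, decompose $V$ into eigenspaces of $g$: $V = \oplus_{j=0}^{N-1} V^j$ where $V^j = \{v \in
V \mid g.v = \xi_N^j v\}$. A $g$-twisted $V$-module $(M, Y_M)$ has a $\mathbb{C}$-grading $M = \oplus_{\alpha \in \mathbb{C}} M_\alpha$, with
$\dim M_\alpha < \infty$, as in Definition 5.3.1, as well as a linear map $V \to (\mathrm{End}\, M)[[z^{\pm 1/N}]]$, written
$Y_M(u, z) = \sum_{r \in \mathbb{Z}/N} u_r z^{-r-1}$, such that (5.3.1a), (5.3.1c) hold,

$$Y_M(u, z) = \sum_{r \in -j/N + \mathbb{Z}} u_r z^{-r-1} \qquad \forall u \in V^j, \qquad (5.3.22a)$$

and (5.3.1b) becomes

$$z_0^{-1} \delta\left(\frac{z_1 - z_2}{z_0}\right) Y_M(u, z_1) Y_M(v, z_2) - z_0^{-1} \delta\left(\frac{z_2 - z_1}{-z_0}\right) Y_M(v, z_2) Y_M(u, z_1)$$

$$\qquad = z_2^{-1} \left(\frac{z_1 - z_0}{z_2}\right)^{j/N} \delta\left(\frac{z_1 - z_0}{z_2}\right) Y_M(Y(u, z_0) v, z_2), \qquad (5.3.22b)$$

where $u \in V^j$. We say two $g$-twisted $V$-modules $M$, $N$ are isomorphic if there is an
isomorphism $\varphi : M \to N$ satisfying $\varphi\, Y_M(v, z) = Y_N(v, z)\, \varphi$ for all $v \in V$. Note that an
$e$-twisted $V$-module ($e$ being the identity automorphism) is an ordinary $V$-module.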
Any $h \in \mathrm{Aut}(V)$ permutes the twisted $V$-modules as follows. Let $M$ be $g$-twisted, and
for each $v \in V$ define

$${}_hY_M(v, z) := Y_M(h.v, z).$$

Then $(M, {}_hY_M)$ is an $h^{-1}gh$-twisted $V$-module. When $h$ and $g$ commute, we say the
module $(M, Y_M)$ is $h$-stable if $(M, Y_M)$ and $(M, {}_hY_M)$ are isomorphic. We call $h \in \mathrm{Aut}(V)$
an inner automorphism of $V$, and write $h \in \mathrm{Inn}(V)$, if every untwisted $V$-module is
$h$-stable.

Now let $M$ be an irreducible $g$-twisted $V$-module, and $G$ any group of automorphisms
$h \in \mathrm{Aut}(V)$ commuting with $g$ such that $M$ is $h$-stable for all $h \in G$. Then for each $h \in
G$, we get an automorphism $\varphi(h) : M \to M$ of $M$, satisfying $\varphi(h)\, Y_M(v, z)\, \varphi(h)^{-1} =
Y_M(h.v, z)$. Hence we can perform Thompson’s trick (0.3.3) and write

$$Z(M, h; \tau) := q^{-c/24}\, \mathrm{tr}_M\, \varphi(h)\, q^{L_0}. \qquad (5.3.23)$$

These $Z(M, h)$’s are the building blocks of the graded dimensions of various eigenspaces
of $h$ in $M$: for example, if $h$ has order $m$, then the subspace of $M$ fixed by the
automorphism $\varphi(h)$ will have graded dimension $m^{-1} \sum_{i=1}^{m} Z(M, h^i)$.
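
The averaging formula is the graded version of a familiar finite-dimensional fact: for an operator $P$ of finite order $m$, the dimension of its fixed subspace is $m^{-1} \sum_{i=1}^{m} \mathrm{tr}(P^i)$. The toy check below uses a permutation matrix in place of $\varphi(h)$; the function names and any particular permutation passed in are illustrative and have nothing to do with a specific VOA.

```python
def permutation_trace(perm, power):
    """Trace of P**power for the permutation matrix P of perm (= number of fixed points)."""
    count = 0
    for start in range(len(perm)):
        x = start
        for _ in range(power):
            x = perm[x]
        if x == start:
            count += 1
    return count


def fixed_dimension_by_averaging(perm, order):
    """(1/order) * sum_{i=1}^{order} tr(P^i), where P is the permutation matrix of perm."""
    total = sum(permutation_trace(perm, i) for i in range(1, order + 1))
    assert total % order == 0
    return total // order


def cycle_count(perm):
    """Number of cycles of perm, i.e. the dimension of the space of vectors fixed by P."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return cycles
```

For a permutation with cycles of lengths 3, 2 and 1 (order 6), both computations give 3: one invariant vector per cycle.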

This assignment $\varphi$ does not necessarily define a representation of $G$ in $\mathrm{End}(M)$.
However, $\varphi(h_2)^{-1} \varphi(h_1)^{-1} \varphi(h_1 h_2)$ clearly commutes with all vertex operators $Y_M(v, z)$
and so by irreducibility of $M$ is a scalar multiple $c_g(h_1, h_2)$ of the identity. Equivalently,
$\varphi$ is a projective representation of $G$:

$$\varphi(h_1 h_2) = c_g(h_1, h_2)\, \varphi(h_1)\, \varphi(h_2). \qquad (5.3.24)$$

For any $h, k \in C_G(g)$ (i.e. commuting with $g$), $\varphi(khk^{-1}) = \alpha_{k,h}\, \varphi(k) \varphi(h) \varphi(k)^{-1}$ for
some scalar $\alpha_{k,h}$, and thus $Z(M, khk^{-1}; \tau) = \alpha_{k,h}\, Z(M, h; \tau)$ by the cyclic property
of trace. This means that, for fixed $g$, it suffices to restrict to one $h$ from each
$C_G(g)$-conjugacy class. By a similar argument (Question 5.3.6), we get that $Z(M, h; \tau)$ vanishes
identically, unless for all $k \in C_G(g)$ commuting with $h$, the 2-cocycle of (5.3.24) satisfies
$c_g(h, k) = c_g(k, h)$. Thus we can further restrict to those $h$.


Conjecture 5.3.10 [136], [138], [152] Suppose $V$ is a weakly rational VOA, with
exactly $n$ irreducible $V$-modules $M_1, \ldots, M_n$. Fix any finite subgroup $G$ of $\mathrm{Inn}(V)$.
Then:

(a) For any $g \in \mathrm{Inn}(V)$, there will be exactly $n$ irreducible $g$-twisted $V$-modules
$M_1^g, \ldots, M_n^g$. Moreover, each $M_i^g$ has a conformal weight $h_i^g \in \mathbb{Q}$ as in Lemma 5.3.3,
and any $g$-twisted $V$-module is completely reducible into a direct sum of the $M_i^g$.
Labelling the modules appropriately, we get $(M_i^g, {}_hY_{M_i^g}) \cong (M_i^{h^{-1}gh}, Y_{M_i^{h^{-1}gh}})$. This
defines a projective representation $\varphi(h)$ of the centraliser $C_G(g)$ as in (5.3.24).

(b) For each commuting pair $g, h \in G$, define $Z^i_{(g,h)}(\tau) := Z(M_i^g, h; \tau)$. Then each
$Z^i_{(g,h)}(\tau)$ is holomorphic in $\mathbb{H}$, and is a modular function for (i.e. is fixed by) some
congruence subgroup. For any $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z})$, there exist scalars $\alpha(A, g, h)_{ij}$
such that

$$Z^i_{(g,h)}\left(\frac{a\tau + b}{c\tau + d}\right) = \sum_{j=1}^{n} \alpha(A, g, h)_{ij}\, Z^j_{(g^a h^c,\, g^b h^d)}(\tau). \qquad (5.3.25)$$

(c) Let $V^G$ be the vertex operator subalgebra consisting of all $v \in V$ fixed by all elements
of $G$. Then the $\mathbb{C}$-span of the graded dimensions of all nontwisted $V^G$-modules will
equal that of all $Z^i_{(g,h)}(\tau)$ for commuting $g, h \in G$, and the total number of irreducible
$V^G$-modules will equal $n$ times the sum, over representatives $g$ of all conjugacy
classes in $G$, of the number of inequivalent irreducible projective representations of
$C_G(g)$ with 2-cocycle $c_g$ as in (5.3.24).

(d) In the special case that $V$ is holomorphic (i.e. $n = 1$), $\mathrm{Inn}(V) = \mathrm{Aut}(V)$ and the
coefficients $\alpha_{ij}$ in (5.3.25) are roots of unity. There is a 3-cocycle $\alpha \in H^3(G, U_1(\mathbb{C}))$
such that the 2-cocycle $c_g$ of (5.3.24) is given by

$$c_g(h_1, h_2) = \alpha(g, h_1, h_2)\, \alpha(h_1, h_2, g)\, \alpha(h_1, g, h_2)^{-1}.$$

Some progress towards this important conjecture is provided by, for example, [150]. 
Monstrous Moonshine is interested in the holomorphic case (i.e. n = 1), which is by far 
the best understood; we return to it in Section 6.2.4. The number of irreducible projective 
representations in (c) is described in Section 3.1.1. We find in (d) that the cohomology 
group $H^3(G, U_1(\mathbb{C})) \cong H^4(G, \mathbb{Z})$ (trivial action of $G$ on the coefficients) classifies all
the possibilities for the orbifold; the analogous result for nonholomorphic VOAs is much 
more subtle, being more sensitive to the structure of V, and is still poorly understood. 

Part (c) leads us to a Galois theory for $V^G$. But considering the depth of Jones’ Galois
theory for subfactors, and the ‘Galois theory’ for lattices sketched in Section 2.3.5, it 
is clear that a far more interesting theory is possible for VOAs. It would certainly be 
interesting to develop this. 

The easiest examples of the orbifold construction are of a self-dual lattice VOA V(L) by 
a subgroup G of the automorphism group of L (see e.g. [150]). We learn in Section 5.2.2 
that there is a deep analogy between lattices and VOAs. This orbifold construction of 
VOAs corresponds directly to the shift construction of lattices outlined in Section 2.3.3. 

The most famous VOA, the Moonshine module $V^\natural$, was the original orbifold.
Frenkel–Lepowsky–Meurman [201] obtained it as the orbifold of the Leech lattice VOA $V(\Lambda)$
by the $\pm 1$-symmetry of $\Lambda$. Since $\Lambda$ is self-dual, $V(\Lambda)$ is holomorphic. As predicted by
Conjecture 5.3.10, there is a unique $-1$-twisted $V(\Lambda)$-module. We discuss this orbifold
more in Sections 4.3.4 and 7.2.1; see also [201] for details.


Question 5.3.1. Let $V$ be any VOA, and let $W$ be a vector space and $T : V \to W$ be any
isomorphism of vector spaces. Use this linear map T to carry the VOA structure on V 
to one on W. 


Question 5.3.2. Let $V$ be any VOA and let $V_{[n]}$ be the grading induced by $L[0]$ in (5.3.16d).
Prove for any $N \ge 0$, that

$$\oplus_{n=0}^{N} V_n = \oplus_{n=0}^{N} V_{[n]}.$$

Question 5.3.3. Find an expression for the coefficient of the $q^\alpha$ term in the one-point
function $\chi_M(\tau, v)$, using the representation theory of the stabiliser $G_v \le \mathrm{Aut}\, V$ and the
eigenvalues of the zero-mode $o(v)$ restricted to the homogeneous space $M_\alpha$.


Question 5.3.4. Let $V$ be any holomorphic weakly rational $C_2$-cofinite VOA with central
charge $c = 24$. Prove that its graded dimension $\chi_V(\tau)$ must equal $J(\tau) + c$, where the
constant $c$ is $\dim V_1$.


Question 5.3.5. (a) Relate the Jacobi character $\chi^J_M$ of a lattice VOA $V(L)$, for $L$
positive-definite and with even integer inner-products, and the theta series $\Theta_L$ of (2.3.7).
(b) Relate the Jacobi character $\chi^J_{L(\lambda)}$ of an irreducible module of an integrable affine
VOA $V(\mathfrak{g}, k)$, for $\mathfrak{g}$ simple, with the affine algebra character $\chi_\lambda$ of (3.2.9a).


Question 5.3.6. Let $M$ be $g$-twisted. Show that the series $Z(M, h)$ of (5.3.23) is identically
0, unless $h \in C_G(g)$ has the property that, for all $k \in C_G(g)$ commuting with $h$, $c_g(h, k) =
c_g(k, h)$. (Hint: first show that $Z(M, hk)$ is identically 0 if $c_g(h, k) \ne c_g(k, h)$; then use
the 2-cocycle condition (3.1.1b).)


5.4 Geometric incarnations 


Vertex (operator) algebras are a deep construct and, in spite of their complexity, are 
here to stay. In this section we describe some connections with geometry. Section 5.4.1 
describes the programme to rigorously construct CFTs in Segal’s sense (Section 4.4.1), 
from VOAs. Section 5.4.2 reviews the geometric side of vertex operator superalgebras. 


5.4.1 Vertex operator algebras and Riemann surfaces 


The introductory chapter stated that the physics of Moonshine exploits the duality 
between Hamilton’s and Feynman’s pictures of CFT. Manin put it this way back in 
1985: 


The quantum theory of (super)strings exists at present in two entirely different 
mathematical fields. Under canonical quantization it appears to a mathematician as 
the representation theory of algebras of Heisenberg, Virasoro, and Kac–Moody and
their superextensions. Quantization with the help of the Polyakov path integration 
leads to the analytic theory of algebraic (super)curves and their moduli spaces, 
to invariants of the type of the analytic curvature, etc. Establishment of direct 
mathematical connections between these two forms of a single theory is a big and 
important problem. [402] 


Our best answer to Manin is the theory of geometric vertex operator algebras. 

Note that any time we have an algebraic structure with a binary operation (e.g. a
‘product’) ab, we can express multiple products using binary trees, which keep track of
the brackets. For example, the binary trees in Figure 5.1 correspond to the products XY
and A((BC)D), respectively. The external (i.e. valence 1) vertices are assigned vectors,
while each internal vertex corresponds to a single product. Different algebraic structures
can be axiomatised from this ‘geometric’ point of view. For instance, if the product is
associative (e.g. we have a group), then it doesn’t matter where we place the brackets:
for example, the above ABCD-binary tree can be replaced with the tree in Figure 5.2.

Fig. 5.1 Some binary trees.

Fig. 5.2 Associativity.

More interesting for us are the geometrical axioms for Lie algebras [294]. Let V be any
Lie algebra. Then to any binary tree with n legs corresponds a linear map φ from n copies
V ⊗ ··· ⊗ V of the vector space V, to V. The map corresponding to the ABCD-binary
tree of Figure 5.1 takes the Lie algebra vectors A, B, C, D to the nested Lie bracket
[A[[BC]D]]. It is then fairly straightforward to encode all properties of the Lie algebra
in the language of trees. For example, anti-commutativity says that if we flip the two
descendants of an inner vertex of the tree (for example, in Figure 5.1 flipping D with
the 3-vertex tree containing B and C), then the corresponding maps φ differ by a factor
of −1. Gluing the root (uppermost vertex) of one tree to an external vertex of another
corresponds in the Lie algebra to inserting one nested bracket into the middle of another.
The only nontrivial property is anti-associativity (see Question 5.4.2). The result is a
formulation of Lie algebras that is completely equivalent to the usual algebraic one [294].
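
As a small illustration of the tree language, the following Python sketch evaluates a binary tree as a nested bracket. It is not taken from [294]: it assumes one concrete Lie algebra (R^3 with the cross product as bracket), and the names bracket and evaluate are hypothetical helpers. It checks that flipping the two descendants of an inner vertex changes the value by a factor of −1, and that the Jacobi identity holds on the sample vectors.

```python
# Sketch only: trees are nested pairs, leaves are vectors in R^3.
# Assumption: the Lie bracket is the cross product on R^3.

def bracket(u, v):
    """Lie bracket on R^3, i.e. the cross product of u and v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def evaluate(tree):
    """Internal vertex = pair (left, right); external vertex = 3-tuple of numbers."""
    if len(tree) == 2:                       # internal vertex: one bracket
        return bracket(evaluate(tree[0]), evaluate(tree[1]))
    return tree                              # external vertex: the vector itself

A, B, C, D = (1, 2, 0), (0, 1, 3), (2, 0, 1), (1, 1, 1)

tree = (A, ((B, C), D))        # the ABCD tree, i.e. [A[[BC]D]]
flipped = (A, (D, (B, C)))     # flip D with the subtree containing B and C

value = evaluate(tree)
value_flipped = evaluate(flipped)

# Jacobi identity [A[BC]] + [B[CA]] + [C[AB]] = 0 on the same vectors.
jacobi = tuple(x + y + z for x, y, z in zip(evaluate((A, (B, C))),
                                            evaluate((B, (C, A))),
                                            evaluate((C, (A, B)))))
```

Representing a tree as nested pairs keeps the bracketing and the tree shape literally the same object, which is the point of the geometric formulation.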

Now, if we ‘two-dimensionalise’ that definition of ‘geometric Lie algebra’, we get
something called a geometric VOA [295] that is equivalent to the ‘algebraic’ VOA of
Definition 5.1.3. In place of binary trees (Figure 5.1), we have spheres with tubes
(Figure 5.3). Equivalently, a sphere with n tubes is the Riemann sphere with n marked points
and a choice of local coordinate at each point, that is, an enhanced surface of type (0, n)
(Section 2.1.4). The moduli space of binary trees with n legs is a finite set, but the moduli
space of spheres with n tubes is an infinite-dimensional complex space. To each such
sphere with tubes we get a linear map φ from n copies of our vector space V (which
is our VOA) to V (or rather a certain completion of V, a complication caused by the
infinite-dimensionality of V). A geometric VOA satisfies meromorphicity requirements,
and most importantly the sewing axiom. In fact this map φ is Segal’s functor S described
in Section 4.4.1.
Fig. 5.3 The surfaces corresponding to Figure 5.1.


The point is that the resulting notion of geometric VOA is equivalent to that of algebraic
VOA [295], though it takes considerable effort to show this. Thus a VOA is an ‘algebra’
with a two-dimensional analogue of a binary operation. In particular, let $P_w$ be the
simplest pair-of-pants, namely the Riemann sphere $P^1(\mathbb{C})$ with marked points 0, ∞ and w
and local coordinates given by z, 1/z and z − w (z being the global coordinate on C). Then
the formal series Y(u, w)v corresponds to $S(P_w)(u \otimes v)$. On the other hand, consider the
annulus, that is the Riemann sphere with marked points 0 and ∞, with local coordinates
z and $\exp[-\epsilon z^{-1} d/dz]\, z^{-1}$. Recalling the realisation $-z^{-1} d/dz = \ell_{-2} \in$ Witt and the
formula $\omega = L_{-2}\mathbf{1}$, we can recover the conformal vector ω by differentiating with respect
to ε the map obtained from S. The Virasoro algebra is fundamental here, capturing
the effect of changing local coordinates (recall (5.3.15c)), and is responsible for the
meromorphicity in the geometric VOA. The Jacobi identity (5.1.7a) is obtained from
the sewing axiom. This equivalence relates formal power series (algebra) to distribution
theory (analysis). It proves that the chiral blocks $\langle v, Y(u_1, z_1) \cdots Y(u_n, z_n) v' \rangle$ will be
meromorphic, except for poles at $z_i = z_j$.

As mentioned before, a group corresponds to trees such as Figure 5.2. We can also 
two-dimensionalise that, and obtain what Huang calls a vertex group [293]. The easiest 
examples are C* and the enhanced moduli space $M_{0,1}$. Vertex groups should be to VOAs
what Lie groups are to Lie algebras. 

The motivation for this deep work is to construct examples satisfying Segal’s defini- 
tions of CFT and modular functor. We know at present no nontrivial examples, although 
the general belief is that any sufficiently nice VOA will provide one. Huang’s work [295] 
establishes this for genus 0, and more recently he has pushed it to genus 1 [297]. 

We end this subsection on a more speculative note [560], [295]. According to Witten,
to understand string theory conceptually, we need a new analogue of Riemannian
geometry. In contrast to the more classical ‘particle-math’, there is a more modern
‘string-math’. We have the real numbers (particle physics) versus the complex numbers
(string theory); binary trees versus spheres with tubes; Lie algebras versus VOAs; the
representation theory of Lie algebras versus RCFT, etc. What are the stringy analogues
of calculus, ordinary differential equations, Riemannian manifolds, the Atiyah–Singer
Index theorem, ...? Huang suggests that just as we could imagine Moonshine as a mystery
that is explained in some way by RCFT, perhaps the stringy version of calculus would
similarly explain the mystery of two-dimensional gravity, stringy ODEs would explain
the mystery of infinite-dimensional integrable systems, stringy Riemannian manifolds
would help explain the mystery of mirror symmetry, and the stringy index theorem would
help explain the elliptic genus (for this latter possibility, consider the work of Tamanoi
reviewed in the next subsection).

What makes this more subtle is that complexification is not unique. To give a simple
example, $S^1$ can be thought of as the real projective space $P^1(\mathbb{R})$ and as the Lie group
$SO_2(\mathbb{R})$. The obvious complexification of $P^n(\mathbb{R})$ is $P^n(\mathbb{C})$. An obvious complexification
of $SO_n(\mathbb{R})$ is $SO_n(\mathbb{C})$. But if we think of $O_n(\mathbb{R})$ more geometrically as the real matrices
that preserve the quadratic form $x_1^2 + \cdots + x_n^2$, then its complexification should be those
complex matrices that preserve the Hermitian form $|x_1|^2 + \cdots + |x_n|^2$, i.e. $U_n(\mathbb{C})$. Thus
the complexifications of $S^1$ in these cases would be the 2-sphere $P^1(\mathbb{C})$, the cylinder
$SO_2(\mathbb{C})$ (i.e. the multiplicative group $\mathbb{C} \setminus \{0\}$) and the 3-sphere $SU_2(\mathbb{C})$ (as a real Lie
group). So the specific complexification obtained depends on the context. In all cases,
the way to proceed is to convert the defining relations of the given object into symbols
that make sense over C.

What sense can we make of the statement that the complexification of a binary tree is a
sphere with punctures? Consider the simplest case: the segment 0 ≤ x ≤ 1. This can be
thought of geometrically as the locus $(a, b, c) \in \mathbb{R}^3$ satisfying $a + b^2 - 1 = a - c^2 = 0$.
Over the complex numbers the parameter ‘a’ is redundant, and this locus has the obvious
complexification $w^2 + z^2 = 1$. We know this is a sphere with two punctures, that is, a
cylinder as we would like it to be.
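
One way to see the cylinder concretely: the curve w² + z² = 1 factors as (w + iz)(w − iz) = 1, so each point corresponds to a nonzero complex number t = w + iz, with w = (t + 1/t)/2 and z = (t − 1/t)/(2i). The Python sketch below checks this parametrisation numerically on a few sample points; the function names are hypothetical and the samples are arbitrary.

```python
import cmath

def point_on_conic(t):
    """Map a nonzero complex number t to (w, z) on the curve w^2 + z^2 = 1."""
    w = (t + 1 / t) / 2
    z = (t - 1 / t) / 2j
    return w, z

def parameter(w, z):
    """Inverse map: recover t = w + i z from a point of the curve."""
    return w + 1j * z

samples = [2, -0.5, 1j, 3 - 4j, cmath.exp(0.7j)]

# How far each image is from satisfying w^2 + z^2 = 1.
residuals = [abs(w * w + z * z - 1) for w, z in map(point_on_conic, samples)]

# How far t -> (w, z) -> t is from the identity.
round_trips = [abs(parameter(*point_on_conic(t)) - t) for t in samples]
```

Since t ranges over the plane with the origin removed, the curve is a sphere with the two points t = 0 and t = ∞ deleted.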

Incidentally, Arnol’d speculates that there is in fact a triality: the reals, the complex 
numbers and the quaternions. He discusses several examples in [18], as well as some 
applications of this thought. This suggests that there is a third structure, generalising 
vertex algebras much as vertex algebras generalise Lie algebras. 


5.4.2 Vertex operator superalgebras and manifolds 


Through the work of Witten and others, we have discovered that much can be learned 
about a space X, by studying a string theory living in X. Much of this is reviewed in [291]. 
For example, to a Calabi–Yau manifold X [299], [571] and an element of its complexified
Kähler cone, string theory associates two N = 2 superconformal field theories, called the
A and B models (which focus on respectively the Kähler and complex structures of X).
To clarify (and rigorise) these ideas, Malikov–Schechtman–Vaintrob [401] suggested
how one may construct, given X, the vertex algebra of the N = 2 superconformal field 
theory (the A model) associated with X. This work is clearly fundamental. We can only 
sketch it here. 

To any smooth complex variety X, reference [401] associates a sheaf of (N = 1) vertex
operator superalgebras, called the chiral de Rham complex $MSV_X$. In other words, to
every open set U ⊂ X, there is a near-vertex operator superalgebra $MSV_X(U)$ (the
‘space of sections of $MSV_X$ over U’). Whenever sets U ⊂ V are open, there is a
surjective restriction map $r_{UV} : MSV_X(V) \to MSV_X(U)$, which is a homomorphism
of near-vertex operator superalgebras. We briefly discuss vertex operator superalgebras
in Section 5.1.3. These near-vertex operator superalgebras are bi-graded, by commuting
operators $L_0$ (the Hamiltonian) and $J_0$ (the fermionic charge) with eigenvalues in N and Z,
respectively. They form a complex in the sense that there is a differential $Q_{BRST}$ obeying
$Q_{BRST}^2 = 0$ and increasing fermionic charge by 1. When the open set U is homeomorphic
to an open ball in $\mathbb{C}^n$, then $MSV_X(U)$ is essentially the tensor product of n copies of
what string theory calls a bosonic (βγ) ghost system (similar to the Heisenberg VOA),
with n copies of a (bc) fermionic ghost system. The physics of these ghost systems is
described in [463].

The prototypical example of ‘sheaf’ is the structure sheaf $O_X$, which associates
with each open set U the space of functions f : U → C. The prototypical example of
‘complex’ is the de Rham complex, given by the space of differential forms on X, with
a differential d obeying $d^2 = 0$ and taking p-forms to (p + 1)-forms. Of course the point
of a complex is to take the cohomology H* = ker d / im d. The books [537] are a readable
introduction to algebraic geometry; in particular section 2.2 provides elementary
examples of sheaves, and section 6.1 treats sheaf cohomology. For a sheaf F over X,
$H^0(X, F)$ is always the space of global sections F(X), and it is common for the other
$H^i(X, F)$ to all vanish. The name ‘chiral de Rham complex’ was chosen because the $L_0 = 0$
subspace can be identified with the familiar space of differential forms (‘chiral’ refers to the chiral
algebra of Section 4.3.2 or the chiral ring discussed in [291]).
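
To make ‘cohomology = ker d / im d’ concrete, here is a minimal Python sketch on a finite-dimensional stand-in for the de Rham complex. It is an assumption of this example, not something from [401] or [537]: the complex is the simplicial cochain complex of the boundary of a tetrahedron (a combinatorial 2-sphere), with functions on vertices, edges and faces playing the role of 0-, 1- and 2-forms. The sketch checks that the differential squares to zero and computes the dimension of each cohomology group from matrix ranks.

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank of a matrix (list of rows), by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

vertices = [0, 1, 2, 3]
edges = list(combinations(vertices, 2))   # 6 edges (i, j) with i < j
faces = list(combinations(vertices, 3))   # 4 faces (i, j, k) with i < j < k

# (d0 f)(i, j) = f(j) - f(i): vertex functions -> edge functions.
d0 = [[(v == j) - (v == i) for v in vertices] for (i, j) in edges]

# (d1 g)(i, j, k) = g(j, k) - g(i, k) + g(i, j): edge functions -> face functions.
d1 = [[(e == (j, k)) - (e == (i, k)) + (e == (i, j)) for e in edges]
      for (i, j, k) in faces]

# The composite d1 d0 should be the zero matrix (the analogue of d^2 = 0).
d1d0 = [[sum(d1[r][k] * d0[k][c] for k in range(len(edges)))
         for c in range(len(vertices))] for r in range(len(faces))]

# dim H^p = dim ker(d_p) - rank(d_{p-1}).
betti = [len(vertices) - rank(d0),
         len(edges) - rank(d0) - rank(d1),
         len(faces) - rank(d1)]
```

The resulting dimensions are those expected for a 2-sphere: one class in degree 0, none in degree 1, one in degree 2.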

In the case of $MSV_X$, the sheaf cohomology $H^*(X, MSV_X)$ yields the global section
$MSV_X(X)$, which is a near-vertex operator superalgebra. The case where X is Calabi–Yau
is the most interesting, as $MSV_X(X)$ has N = 2 (rather than merely N = 1) supersymmetry,
which makes it much richer. $MSV_X(X)$ is a fundamental invariant associated
with X, and much information about X can be recovered from it. For example, the usual de
Rham cohomology $H^*_{dR}(X)$ of X is $H^*(MSV_X(X); Q_{BRST})$. For another example, the
elliptic genus (discussed shortly) of X equals $\mathrm{tr}_{MSV_X(X)}\, q^{L_0} y^{J_0}$ [81].

Elliptic genus appeared in the mid-1980s in both string theory and topology. For
details see, for example, [287], [499], [523]. In Thom’s cobordism ring Ω, elements are
equivalence classes of cobordant manifolds, addition is connected sum and multiplication
is Cartesian product. The universal elliptic genus φ(M) is a ring homomorphism from
$\Omega \otimes \mathbb{Q}$ to the ring of power series in q, which sends n-dimensional manifolds with spin
connections (see [369] for the relevant geometry) to a weight n/2 modular form of $\Gamma_0(2)$
with integer coefficients. Several variations and generalisations have been introduced; for
example, the Witten genus assigns spin manifolds with vanishing first Pontrjagin class
a weight n/2 modular form of $SL_2(\mathbb{Z})$ with integer coefficients. On a finite-dimensional
manifold M, the index of the Dirac operator (in the heat kernel interpretation) is a path
integral in supersymmetric quantum mechanics, that is an integral over the loop space
$LM = \{\gamma : S^1 \to M\}$; the string theory version of this is that the index of the Dirac
operator on LM should be an integral over L(LM), that is over smooth maps of tori
into M, and this (heuristically) is just the elliptic genus, and explains why it should be
modular.

The important rigidity property of the Witten genus with respect to any compact Lie
group action on the manifold is a consequence of the modularity of the characters of affine
algebras (our Theorem 3.2.3) [388]. In physics, elliptic genera arise as partition functions
of N = 2 superconformal field theories [561]. The Witten genus (normalised by $\eta^8$) of
the Milnor–Kervaire manifold $M_0^8$, an eight-dimensional manifold built from the $E_8$
diagram, equals $j^{1/3}$ [287]. Also, the elliptic genus of even-dimensional projective spaces
$P^{2n}(\mathbb{C})$ unexpectedly has only nonnegative coefficients and in fact equals the graded
dimension of a certain vertex algebra [400]; this suggests interesting representation-theoretic
questions in the spirit of Monstrous Moonshine. Exciting developments are
described in [517], including relations with von Neumann (sub)factors.

Related to $MSV_X$ must be the work of Tamanoi [521]. The index of an operator d
is dim ker d − dim coker d; we can interpret this geometrically as the superdimension associated
with the ‘superpair’ (ker d; coker d) of vector spaces. This is what Tamanoi does with the
elliptic genus. In particular, to each closed Riemannian manifold X he associates a vertex
operator superalgebra T(X), determined from its geometry. It has a nonnegative half-integer
grading and central charge N = dim X/2. The Riemannian metric of X yields
the conformal vector ω. In the special case of a Kähler manifold, the Kähler forms (i.e.
the closed real differential forms of type (1,1)) form a level 1 representation of the affine
algebra $D_N^{(1)}$. Again, the elliptic genus is recovered as the graded dimension of T(X).
It is obviously desirable to relate these invariants T(X) and $MSV_X(X)$. We return to
elliptic genus in Section 7.3.7.


Question 5.4.1. Find a complexification for the Möbius band.


Question 5.4.2. In a non-associative algebra, the ambiguous product $v_1 \cdots v_n$ can only
be evaluated when the n − 1 pairs of brackets are placed. Let L be any Lie algebra. Prove
that for any n ≥ 3, L has an identity of the form

$$v_1 \cdots v_n = v_1 v_n v_2 v_3 \cdots v_{n-1} + v_1 v_2 v_n v_3 \cdots v_{n-1} + \cdots + v_1 v_2 \cdots v_n.$$

More precisely, for any choice of bracketing on the left, prove that there is a choice of
bracketing for each of the n − 1 terms on the right such that the resulting formula holds
for any $v_i \in L$. For example, $[[v_1 v_2] v_3]$ equals $[[v_1 v_3] v_2] + [v_1 [v_2 v_3]]$ and $[[v_1 [v_2 v_3]] v_4]$
equals $[[v_1 v_4][v_2 v_3]] + [v_1 [[v_2 v_4] v_3]] + [v_1 [v_2 [v_3 v_4]]]$.
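
The two example identities can be sanity-checked numerically before attempting the general proof. The Python sketch below does this in one Lie algebra chosen for convenience (an assumption of this example): 2 × 2 integer matrices with the commutator as bracket. The helper names and the four sample matrices are hypothetical. A numerical check in one algebra is of course not a proof.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def br(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(XY, YX)]

def add(*matrices):
    """Entrywise sum of several matrices of the same shape."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*matrices)]

v1 = [[0, 1], [0, 0]]
v2 = [[0, 0], [1, 0]]
v3 = [[1, 2], [3, 4]]
v4 = [[2, 0], [1, 1]]

# [[v1 v2] v3] = [[v1 v3] v2] + [v1 [v2 v3]]
lhs3 = br(br(v1, v2), v3)
rhs3 = add(br(br(v1, v3), v2), br(v1, br(v2, v3)))

# [[v1 [v2 v3]] v4] = [[v1 v4][v2 v3]] + [v1 [[v2 v4] v3]] + [v1 [v2 [v3 v4]]]
lhs4 = br(br(v1, br(v2, v3)), v4)
rhs4 = add(br(br(v1, v4), br(v2, v3)),
           br(v1, br(br(v2, v4), v3)),
           br(v1, br(v2, br(v3, v4))))
```

In both cases the last vector is moved through the product one slot at a time, which is the pattern of the general identity.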


6 


Modular group representations throughout the realm 


There are two aspects to Moonshine. The more general one is the unexpected presence 
of modular group actions over a wide range of algebraic settings, and is now fairly well 
understood. We have seen instances of this already with, for example, the characters 
of affine algebras and VOAs. This chapter completes our treatment of these modular 
actions. The more specific aspect — the association of Hauptmoduls to the Monster — is 
still poorly understood and is the subject of the following chapter. 

Much of this chapter is orthogonal to Monstrous Moonshine. For example, we dis- 
cuss here fusion rings and modular data; both the fusion ring and modular data of the 
Moonshine module $V^\natural$ are maximally trivial. Nevertheless, this chapter helps to paint
the general context of Monstrous Moonshine. In Section 7.2.4 we build on some of 
the lessons from this chapter to speculate on a possible second proof of Monstrous 
Moonshine. 


6.1 Combinatorial rational conformal field theory 


Recall the semi-simple Lie algebras: we study their structure and obtain their classification
by abstracting out combinatorial features (e.g. roots, Coxeter–Dynkin diagrams).
Of course this is easy to do with a finite-dimensional linear structure. RCFTs are
infinite-dimensional, but by definition their infinite-dimensional symmetry and implicit rigidity
again effectively reduce them to certain discrete structures. As we see in the next section, those
discrete structures are remarkable for their ubiquity in modern mathematics. See [208],
[207], [33], [131], [437], [236] for further background. As with all other chapters except
Chapter 7, we’ve tended to avoid giving original references, as these are voluminous and
can be recovered from the numerous review articles and books.


6.1.1 Fusion rings 


Recall that the eigenvalues of a self-adjoint (equivalently, Hermitian) matrix are all
real. Consider the following scenario. Let A, B and C be n × n Hermitian matrices with
eigenvalues $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n$, $\beta_1 \ge \cdots \ge \beta_n$, $\gamma_1 \ge \cdots \ge \gamma_n$. What are the conditions
on these eigenvalues so that C = A + B? The answer consists of a number of inequalities
involving the numbers $\alpha_i$, $\beta_j$, $\gamma_k$. Now discretise this problem:


Theorem 6.1.1 Let $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n \ge 0$, $\beta_1 \ge \cdots \ge \beta_n \ge 0$, $\gamma_1 \ge \cdots \ge \gamma_n \ge 0$,
all be integers. Then the following are equivalent:
(i) Hermitian matrices A, B and C = A + B exist with eigenvalues α, β, γ,
respectively;
(ii) the $gl_n(\mathbb{C})$ tensor product multiplicity $T_{\alpha\beta}^{\gamma}$ is nonzero.


Recall from Section 1.5.1 that the finite-dimensional unitary irreducible modules of
the Lie algebra $gl_n(\mathbb{C}) = \mathbb{C} \oplus sl_n(\mathbb{C})$ are naturally labelled by pairs $(a, \lambda) \in \mathbb{R} \times \mathbb{N}^{n-1}$,
where z ↦ iaz is a representation of the abelian Lie algebra C, and $\lambda = (\lambda_1, \ldots, \lambda_{n-1})$
is a highest weight for the simple Lie algebra $sl_n$. The eigenvalues α correspond to labels
$a = \alpha_n$ and $\lambda_i = \alpha_i - \alpha_{i+1}$. The number $T_{\alpha\beta}^{\gamma}$ is the number of times the $gl_n$-module L(γ)
appears in the tensor product L(α) ⊗ L(β) of modules. This remarkable theorem and
related results are discussed in the review article [218].

Now consider instead n × n unitary matrices with determinant 1. Any such matrix
$D \in SU_n(\mathbb{C})$ can be assigned a unique n-tuple $\delta = (\delta_1, \ldots, \delta_n)$ as follows. Write its
eigenvalues as $e^{2\pi i \delta_i}$, where $\delta_1 \ge \cdots \ge \delta_n$, $\sum_i \delta_i = 0$ and $\delta_1 - \delta_n \le 1$. Let $\Delta_n$ be
the set of all such n-tuples δ, as D runs through $SU_n(\mathbb{C})$. Note that D will have finite
order iff all $\delta_i \in \mathbb{Q}$, and that D will be a scalar matrix dI iff all differences $\delta_i - \delta_j \in
\mathbb{Z}$. Of course, a sum of Hermitian matrices corresponds here to a product of unitary
matrices.


Theorem 6.1.2 [4] Choose any rational n-tuples $\alpha, \beta, \gamma \in \Delta_n \cap \mathbb{Q}^n$. Then the following
are equivalent:
(i) there exist matrices $A, B, C \in SU_n(\mathbb{C})$, with C = AB, with n-tuples α, β, γ;
(ii) there is a positive integer k such that all differences $k\alpha_i - k\alpha_j$, $k\beta_i - k\beta_j$,
$k\gamma_i - k\gamma_j$ are integers, and the fusion multiplicity $N_{k\alpha,k\beta}^{k\gamma}$ of $sl_n^{(1)}$ at level k is
nonzero.


We met the affine algebra $sl_n^{(1)} = A_{n-1}^{(1)}$ and its modules in Section 3.2. Here,
kα corresponds to the level-k integrable highest weight $\lambda \in P_+^k(sl_n^{(1)})$ with Dynkin
labels $\lambda_i = k\alpha_i - k\alpha_{i+1}$. The $sl_n^{(1)}$ fusion multiplicities are studied in Section 6.2.1.
Theorems 6.1.1 and 6.1.2 provide one instance of a general principle:


A result or construction valid for $gl_n$ or $sl_n$ tensor products should have an interesting
analogue for the $sl_n^{(1)}$ fusion product.


The $gl_n$ tensor product multiplicities are classical quantities, appearing in numerous
and varied contexts. The $sl_n^{(1)}$ fusion multiplicities are equally fundamental, equally
ubiquitous, but less well understood.

Just as the tensor product multiplicities are structure constants of the character ring of 
the Lie algebra, so do fusion multiplicities define a fusion ring, an aspect of Moonshine 
complementary to Monstrous Moonshine. 


Definition 6.1.3 A fusion ring R = R(β, N) is a commutative ring R with unity 1,
together with a finite basis $\beta = \{x_a \mid a \in \Phi\}$ (over Z) containing $1 = x_0$, such that:
F1. The structure constants $N_{ab}^c$, defined by $x_a x_b = \sum_{c\in\Phi} N_{ab}^c\, x_c$, are all nonnegative
integers.
F2. There is a ring homomorphism x ↦ x* stabilising the basis β (we write $(x_a)^* = x_{a^*}$).
F3. $N_{ab}^0 = \delta_{b,a^*}$.
F4. ‘$S = S^t$’ (we’ll explain this shortly, but it says R is self-dual in a strong sense).
The numbers $N_{ab}^c$ are called fusion multiplicities, the labels a ∈ Φ are called primaries,
0 ∈ Φ is called the vacuum and ‘*’ is called charge-conjugation.


The only reason for distinguishing the basis β from the labels Φ is that for fusion rings the
multiplicative notation (e.g. unit 1) is natural, but in the traditional examples of modular
data additive notation is used. The terminology here comes from RCFT.

An important ingredient of fusion rings, as with character rings, is their preferred
basis β. Abstract rings don’t come with a basis. Forgetting the basis β, fusion rings
aren’t interesting: for example, the algebra $R \otimes_{\mathbb{Z}} \mathbb{C}$ over C (i.e. the span over C of β,
retaining the same multiplication and addition) is isomorphic as a C-algebra to $\mathbb{C}^{\|\Phi\|}$
with operations defined component-wise (see Lemma 6.1.4 below). This is reminiscent
of the character ring of the Lie algebra $X_r$, which is isomorphic (as a C-algebra) to a
polynomial algebra in r variables.

For each a ∈ Φ, define the fusion matrix $N_a$ by

$$(N_a)_{b,c} = N_{ab}^c.$$

Note that the fusion matrix $N_0$ equals the identity matrix I, and $N_{a^*} = (N_a)^t$ (Question
6.1.1). The fusion matrices can be simultaneously diagonalised:


Lemma 6.1.4 (a) Given any fusion ring R = R(β, N), there is a unique (up to
ordering of the columns) unitary matrix S, with rows parametrised by Φ and columns
by say Φ′, obeying both

$$S_{0i} > 0, \qquad (6.1.1a)$$

$$N_{ab}^c = \sum_{i\in\Phi'} \frac{S_{ai}\, S_{bi}\, \overline{S_{ci}}}{S_{0i}} \qquad (6.1.1b)$$

for all a, b, c ∈ Φ and i ∈ Φ′.
(b) All simultaneous eigenspaces of all the fusion matrices are of dimension 1, and are
spanned by each column $S_{\updownarrow b}$.


The proof of Lemma 6.1.4 only involves F1–F3. The condition F4 can now be expressed by
requiring that the S of Lemma 6.1.4 (for some ordering of the columns) obey $S = S^t$ (so
Φ′ = Φ). The proof of Lemma 6.1.4 is elementary (the fusion matrices commute with
each other and hence with their transposes, and so are simultaneously diagonalisable),
and analogues hold in much greater generality. Equation (6.1.1b) says that the bth column
$S_{\updownarrow b}$ of S is an eigenvector of each $N_a$, with eigenvalue $S_{ab}/S_{0b}$. From the unitarity of S, we
know that $S_{ab}/S_{0b} = S_{ac}/S_{0c}$ can hold for all a ∈ Φ only if b = c, which gives us part (b).

The matrix S acts a lot like the character table of a finite group; a general theorem 
valid for character tables has a fusion ring analogue. 
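
As a worked illustration of (6.1.1b) and of the eigenvector statement, the Python sketch below recovers fusion multiplicities from an S matrix. The matrix used is an assumption of this example: it is the S matrix usually quoted for the Ising model, with primaries ordered as vacuum, ε, σ, and it is not displayed in this section. The function names are hypothetical.

```python
from math import sqrt

r = sqrt(2)
# Assumed Ising S matrix; rows and columns ordered (vacuum, epsilon, sigma).
S = [[0.5, 0.5, r / 2],
     [0.5, 0.5, -r / 2],
     [r / 2, -r / 2, 0.0]]
n = len(S)

def verlinde(a, b, c):
    """Right-hand side of (6.1.1b); this S is real, so conjugation is omitted."""
    return sum(S[a][i] * S[b][i] * S[c][i] / S[0][i] for i in range(n))

# N[a][b][c] is the fusion multiplicity N_{ab}^c, rounded to the nearest integer.
N = [[[round(verlinde(a, b, c)) for c in range(n)] for b in range(n)]
     for a in range(n)]

def is_eigenvector(a, b, tol=1e-12):
    """Is column b of S an eigenvector of N_a with eigenvalue S_ab / S_0b?"""
    eigenvalue = S[a][b] / S[0][b]
    column = [S[d][b] for d in range(n)]
    image = [sum(N[a][d][c] * column[c] for c in range(n)) for d in range(n)]
    return all(abs(x - eigenvalue * y) < tol for x, y in zip(image, column))

all_eigen = all(is_eigenvector(a, b) for a in range(n) for b in range(n))
```

With this input, N[2][2] encodes σ · σ = 1 + ε and N[1][1] encodes ε · ε = 1, the familiar Ising fusion rules.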


Note that a priori the rows (parametrising basis vectors) and columns (parametrising 
eigenvectors) of S in Lemma 6.1.4 play entirely different roles. In a natural sense [236], 
the dual ring to R has structure constants given by replacing S in (6.1.1b) with its
transpose $S^t$. This is what underlies calling F4 a self-duality condition. In contrast, the
character ring of a finite group is fusion-like, is diagonalised by the character table, but its 
dual involves multiplying conjugacy classes and is isomorphic to the character ring only 
for abelian groups. The appearance of self-duality here may seem somewhat mysterious, 
but some sort of self-duality is pervasive in the mathematics of this chapter. In particular, 
Drinfel’d’s ‘quantum double’ construction (Section 6.2.3) generates algebraic structures 
possessing fusion rings and modular data, by combining a given (inadequate) algebraic 
structure with its dual in some way. An example is provided by Section 6.2.4, where the 
true (self-dual) fusion ring of a finite group is built up out of the character ring and its 
dual. 

Fusion rings arise naturally in RCFT (Sections 4.3.2 and 6.1.4). The ‘primaries’ are
the chiral primaries, parametrising the irreducible modules of the chiral algebra V. The
fusion multiplicities $N_{ab}^c$ are the dimension of the space of chiral blocks on a
sphere with three punctures (two ‘incoming’ and one ‘outgoing’), where we label those
punctures with the primaries a, b, c. Equation (6.1.1b) is called Verlinde’s formula [542],
and S has an interpretation in terms of modular transformations of the characters (4.3.9a).
A similar formula gives the dimension of any space of chiral blocks:

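$$\dim \mathfrak{B}^{(g,m+n)}_{a_1\cdots a_m;\,b_1\cdots b_n} = N^{(g,m+n)}_{a_1\cdots a_m;\,b_1\cdots b_n} = \sum_{c\in\Phi} (S_{0c})^{2(1-g)}\, \frac{S_{a_1 c}}{S_{0c}} \cdots \frac{S_{a_m c}}{S_{0c}}\, \frac{\overline{S_{b_1 c}}}{S_{0c}} \cdots \frac{\overline{S_{b_n c}}}{S_{0c}}. \qquad (6.1.2)$$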

The depth of Verlinde’s formula (6.1.1b), (6.1.2), which is considerable, lies in this
modular interpretation given to S. The S matrix is called the modular matrix for this
reason. Historically [50], the fusion ring arose directly by interpreting the chiral OPE
symbolically in terms of products of V-families of chiral fields (see e.g. section 7.3 of
[131]).

Recall Perron–Frobenius theory from Section 2.5.2. The fusion matrices $N_a$ are nonnegative,
and it is indeed natural to multiply them:

$$N_a N_b = \sum_{c\in\Phi} N_{ab}^c\, N_c.$$

So we can expect Perron–Frobenius to tell us something interesting. By (6.1.1a), the
Perron–Frobenius eigenvalue of $N_a$ is $S_{a0}/S_{00}$; hence we obtain the important inequality

$$S_{a0}\, S_{0b} \ge |S_{ab}|\, S_{00}. \qquad (6.1.3a)$$

Unitarity of S applied to (6.1.3a) forces

$$\min_{a\in\Phi} S_{a0} = S_{00}. \qquad (6.1.3b)$$

The quantum-dimension D(a) of (5.3.12) equals $S_{a0}/S_{00}$, and so is bounded below by 1.
The borderline case of (6.1.3b) are those primaries a ∈ Φ, called simple-currents
in RCFT, obeying $S_{a0} = S_{00}$. To any such simple-current j ∈ Φ, there is a phase
$\varphi_j : \Phi \to \mathbb{C}$ and a permutation J of Φ such that j = J0 and

$$S_{Ja,b} = \varphi_j(b)\, S_{ab}, \qquad (6.1.4a)$$

$$N_{ja}^b = \delta_{b,Ja}. \qquad (6.1.4b)$$

For example, we see from (4.3.11e) that ε is a simple-current for the Ising model, with
phases $\varphi_\epsilon(0) = \varphi_\epsilon(\epsilon) = 1$ and $\varphi_\epsilon(\sigma) = -1$.
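
The Ising statements can be checked directly. The Python sketch below assumes the S matrix usually quoted for the Ising model (primaries ordered vacuum, ε, σ; it is not displayed in this section), and the permutation J is entered by hand as an assumption. It computes the quantum-dimensions, tests the inequality (6.1.3a), lists the simple-currents and reads off the phases of ε from (6.1.4a) with a = 0.

```python
from math import isclose, sqrt

r = sqrt(2)
# Assumed Ising S matrix; rows and columns ordered (vacuum, epsilon, sigma).
S = [[0.5, 0.5, r / 2],
     [0.5, 0.5, -r / 2],
     [r / 2, -r / 2, 0.0]]
n = len(S)

# Quantum-dimensions S_a0 / S_00.
quantum_dims = [S[a][0] / S[0][0] for a in range(n)]

# Inequality (6.1.3a): S_a0 S_0b >= |S_ab| S_00, up to rounding error.
inequality_holds = all(S[a][0] * S[0][b] >= abs(S[a][b]) * S[0][0] - 1e-12
                       for a in range(n) for b in range(n))

# Simple-currents: primaries with S_a0 = S_00.
simple_currents = [a for a in range(n) if isclose(S[a][0], S[0][0])]

# Phases of the simple-current j = epsilon: phi_j(b) = S_jb / S_0b.
j = 1
phases = [round(S[j][b] / S[0][b]) for b in range(n)]

# Assumed permutation J for epsilon: swaps vacuum and epsilon, fixes sigma.
J = [1, 0, 2]
relation_614a = all(isclose(S[J[a]][b], phases[b] * S[a][b], abs_tol=1e-12)
                    for a in range(n) for b in range(n))
```

The vacuum is always a simple-current, so the list of simple-currents here consists of the vacuum and ε, while σ has quantum-dimension √2.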

It is clear what plays the role of the endomorphism ‘*’ in the character ring of a finite
group: complex conjugation. So take the complex conjugate of (6.1.1b). We find that $\overline{S}$
also simultaneously diagonalises the fusion matrices $N_a$. Hence from Lemma 6.1.4(b)
there is a permutation of Φ, which we denote by C, and some $\alpha_b \in \mathbb{C}$, such that

$$\overline{S_{ab}} = \alpha_b\, S_{a,Cb}.$$

Unitarity of S forces each $|\alpha_b| = 1$. Looking at a = 0 and applying (6.1.1a), we see that
the $\alpha_b$ must be positive. Hence

$$\overline{S_{ab}} = S_{Ca,b} = S_{a,Cb}, \qquad (6.1.5)$$

so as a permutation matrix, $C = S^2$. Comparing F3 to Verlinde’s formula (6.1.1b), we
find that C is charge-conjugation: Ca = a*. Note that C, like complex conjugation, is
an involution, and that $C_{00} = 1$.

More generally, recall our discussion of cyclotomic fields and their Galois automorphisms
from Section 1.7. The character values ch(g) of a finite group G lie in the
cyclotomic field $\mathbb{Q}[\xi]$, for the root of unity $\xi = \xi_{\|G\|}$. Write $\sigma_\ell$ for the automorphism of
$\mathbb{Q}[\xi]$ defined by $\sigma_\ell(\xi) = \xi^\ell$, for some integer ℓ coprime to ‖G‖. Then $\sigma_\ell$ acts on the
character table by

$$\sigma_\ell(\mathrm{ch}(g)) = \mathrm{ch}(g^\ell) = \mathrm{ch}^{\sigma}(g), \qquad (6.1.6)$$

for some character $\mathrm{ch}^{\sigma}$ of G (to see which one, use the fact [308] that every
G-representation is equivalent to a matrix representation with all entries in $\mathbb{Q}[\xi_{\|G\|}]$).


Theorem 6.1.5 [114] Choose any fusion ring, and let S be the associated modular
matrix. The entries $S_{ab}$ of the matrix S lie in some cyclotomic field $\mathbb{Q}[\xi_N]$. Given any
Galois automorphism $\sigma \in \mathrm{Gal}(\mathbb{Q}[\xi_N]/\mathbb{Q})$,

$$\sigma(S_{ab}) = \epsilon_\sigma(a)\, S_{a^\sigma,b} = \epsilon_\sigma(b)\, S_{a,b^\sigma} \qquad (6.1.7)$$

for some permutation $b \mapsto b^\sigma$ of Φ, and some signs (parities) $\epsilon_\sigma : \Phi \to \{\pm 1\}$.


This is a fundamental symmetry of fusion rings, or rather their modular matrices. For
example, for σ equal to complex conjugation, (6.1.7) reduces to (6.1.5). Equation (6.1.7)
is essentially the statement that the fusion multiplicities are rational numbers; the
cyclotomicity follows from Theorem 1.7.1 and depends crucially on self-duality F4. Any
property of charge-conjugation seems to have an analogue for any of these Galois
symmetries, although it is usually more complicated.
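
To see (6.1.7) in a small case, the Python sketch below applies the Galois automorphism √2 ↦ −√2 to an S matrix with entries in Q(√2) and reads off the permutation and the signs. The S matrix is an assumption of this example (the one usually quoted for the Ising model, ordered vacuum, ε, σ). Entries p + q√2 are stored exactly as pairs (p, q), and the helper names are hypothetical.

```python
from fractions import Fraction

h = Fraction(1, 2)
# Entry p + q*sqrt(2) is stored as the pair (p, q).
# Assumed Ising S matrix; rows and columns ordered (vacuum, epsilon, sigma).
S = [[(h, 0), (h, 0), (0, h)],
     [(h, 0), (h, 0), (0, -h)],
     [(0, h), (0, -h), (0, 0)]]

def galois(x):
    """The automorphism of Q(sqrt 2) sending sqrt(2) to -sqrt(2)."""
    p, q = x
    return (p, -q)

def scale(sign, row):
    """Multiply every entry of a row by sign = +1 or -1."""
    return [(sign * p, sign * q) for p, q in row]

def galois_action(matrix):
    """For each row a, find (a_sigma, sign) with sigma(S_ab) = sign * S_{a_sigma, b}."""
    perm, signs = [], []
    for row in matrix:
        image = [galois(x) for x in row]
        for b, candidate in enumerate(matrix):
            for sign in (1, -1):
                if scale(sign, candidate) == image:
                    perm.append(b)
                    signs.append(sign)
    return perm, signs

perm, signs = galois_action(S)
```

Here the automorphism exchanges the vacuum and ε with parity +1 and fixes σ with parity −1, so the signs in (6.1.7) cannot be dropped in general.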

What has a fusion ring to do with ‘modular stuff’? That is explained next. 
6.1.2 Modular data


Choose any even integer n > 0. The matrix

$$S = \left( \frac{1}{\sqrt{n}} \exp\!\left[-2\pi i\, \frac{m m'}{n}\right] \right)_{0 \le m,m' < n} \qquad (6.1.8)$$

is the finite Fourier transform. Define the diagonal matrix T by $T_{mm} = \exp(\pi i m^2/n -
\pi i/12)$. The assignment

$$\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \mapsto S, \qquad \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \mapsto T \qquad (6.1.9)$$

defines an n-dimensional representation ρ of $SL_2(\mathbb{Z})$, using (2.2.1a). This is the simplest
(and least interesting) example of modular data. Verlinde’s formula (6.1.1b) associates
a fusion ring with (6.1.8). Here the labels are Φ = {0, 1, ..., n − 1} and the fusion ring
is the ring of integers $\mathbb{Z}[\xi_n]$ with preferred basis

$$\beta = \{1, \xi_n, \ldots, \xi_n^{n-1}\}.$$

The fusion multiplicities are given by addition mod n.
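
These claims are easy to test numerically. The Python sketch below builds S and T for n = 4, taking the exponent in the finite Fourier transform to be −2πi mm′/n (an assumption about the sign convention, chosen so that the relation S² = (ST)³ holds with the T given above). It then checks that relation, checks that S² is the permutation a ↦ −a, and recovers addition mod n from Verlinde’s formula. All function names are hypothetical.

```python
import cmath
from math import pi, sqrt

def modular_data(n):
    """S (finite Fourier transform) and diagonal T for an even integer n."""
    S = [[cmath.exp(-2j * pi * a * b / n) / sqrt(n) for b in range(n)]
         for a in range(n)]
    T = [[cmath.exp(1j * pi * (a * a / n - 1 / 12)) if a == b else 0j
          for b in range(n)] for a in range(n)]
    return S, T

def matmul(X, Y):
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def close(X, Y, tol=1e-9):
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

n = 4
S, T = modular_data(n)
ST = matmul(S, T)
S2 = matmul(S, S)

# The defining relation of the modular group: S^2 = (ST)^3.
relation_holds = close(matmul(ST, matmul(ST, ST)), S2)

# S^2 should be the permutation matrix of a -> -a (mod n).
minus = [[1 if (a + b) % n == 0 else 0 for b in range(n)] for a in range(n)]
S2_is_minus = close(S2, minus)

def fusion(a, b, c):
    """Verlinde's formula (6.1.1b), rounded to the nearest integer."""
    total = sum(S[a][i] * S[b][i] * S[c][i].conjugate() / S[0][i]
                for i in range(n))
    return round(total.real)

addition_mod_n = all(fusion(a, b, c) == (1 if c == (a + b) % n else 0)
                     for a in range(n) for b in range(n) for c in range(n))
```

Changing n to another even value repeats the check; for odd n the diagonal entries of T are no longer well defined on residues mod n, which is why the text restricts to even n.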
This $SL_2(\mathbb{Z})$-representation (6.1.9) is realised by modular functions in the following
sense. For each a ∈ {0, 1, ..., n − 1}, define the functions

$$\chi_a(\tau) = \frac{1}{\eta(\tau)} \sum_{k=-\infty}^{\infty} q^{n(k+a/n)^2/2},$$

where as always $q = e^{2\pi i \tau}$ and η(τ) is the Dedekind eta function (2.2.6b). Then (4.3.9)
hold. Thus $\chi = (\chi_a)_{a\in\Phi}$ is a vector-valued modular function with multiplier ρ for $SL_2(\mathbb{Z})$
(Definition 2.2.2).


Definition 6.1.6 Let Φ be a finite set of labels, one of which (denote it 0) is
distinguished. Modular data are matrices $S = (S_{ab})_{a,b\in\Phi}$, $T = (T_{ab})_{a,b\in\Phi}$ of complex
numbers such that:
MD1. S, T are unitary and symmetric, and T is diagonal and of finite order. That is,
$T^N = I$ for some N.
MD2. $S_{0a} > 0$ for all a ∈ Φ.
MD3. $S^2 = (ST)^3$.
MD4. The numbers $N_{ab}^c$ defined by (6.1.1b) are nonnegative integers.


From the presentation (2.2.1a) of the modular group $SL_2(\mathbb{Z})$, we see that modular data
defines a representation of $SL_2(\mathbb{Z})$, as in (6.1.9). Modular data abstracts out the $SL_2(\mathbb{Z})$
action arising in unitary RCFT (for non-unitary RCFT, MD2 should be weakened). It is
a significant refinement of fusion rings. In particular, most fusion rings are not realised
by any modular data (Question 6.1.5), but those that are realised are always realised by
at least three sets of modular data.

We can generalise (6.1.8) using lattices (recall Section 1.2.1). If we write L for the
lattice $\sqrt{n}\,\mathbb{Z}$, then $L^* = \frac{1}{\sqrt{n}}\mathbb{Z}$ is the dual lattice, the labels {0, ..., n − 1} parametrise the
cosets L*/L, and the modular function $\chi_a$ is the theta series of the ath coset, normalised
by η. More generally, any even lattice L defines modular data in this way. The vacuum
‘0’ will be [0] = L. The fusion multiplicities $N_{[a][b]}^{[c]}$ equal the Kronecker delta $\delta_{[c],[a+b]}$,
so the fusion product is given by addition in the finite group L*/L. All primaries [a] ∈ Φ
are simple-currents (6.1.4), corresponding to permutation $J_{[a]}([b]) = [a + b]$ and phase
$\varphi_{[a]}([b]) = e^{-2\pi i\, a\cdot b}$. Charge-conjugation (6.1.5) is given by C[a] = [−a]. The Galois
action (6.1.7) here is also simple: there is a Galois automorphism $\sigma_\ell$ for any integer ℓ
coprime to the determinant |L|; $\sigma_\ell$ takes [a] to [ℓa], and all parities $\epsilon_\ell([a])$ equal +1.
From our point of view, however, this lattice example is a little too trivial.

In RCFT (Section 4.3.2), the labels $a \in \Phi$ are the chiral primaries and ‘0’ is the vacuum
state. The matrix T equals (4.3.10). Charge-conjugation C is a symmetry in quantum
field theory that interchanges particles with their anti-particles (and so reverses charge,
hence the name). The modular data S, T arise through (4.3.9), where $\chi_a$ are the one-point
functions on a torus. The above lattice example corresponds to the string theory of m
free bosons compactified on the torus $\mathbb{R}^m/L$, where m = dim L.

Every property of fusion rings should have an analogue in modular data. For example,
the analogue of (6.1.5) is

    $T_{Ca,Cb} = T_{ab}$,    (6.1.10a)

which says that T and $C = S^2 = (ST)^3$ commute. The analogue of (6.1.4) is

    $T_{Ja,Ja}\,\overline{T_{aa}} = \overline{\varphi_j(a)}\; T_{jj}\,\overline{T_{00}}$.    (6.1.10b)


In all known examples, including all those associated with RCFT [37], Galois is
intimately connected with the existence of characters $\chi_a$ realising the modular data as
in (4.3.9), which are modular functions for a congruence subgroup (recall (2.2.4)). In
particular, for all these examples, we get the remarkable property:


Definition 6.1.7 (congruence property) Let S, T be modular data, and let $\rho$ be the
associated SL2(Z)-representation. Let N be the order of the matrix T, so $T^N = I$. Then
we say S, T obey the congruence property if the following are all satisfied: $\rho$ is trivial
(i.e. with value I) on the congruence subgroup $\Gamma(N)$, and so defines a representation of
the finite group $\mathrm{SL}_2(\mathbb{Z}_N)$; we have characters $\chi_a$ realising the modular data in the sense
of (4.3.9), and those characters are modular functions for $\Gamma(N)$; the entries $S_{ab}$ all lie
in the cyclotomic field $\mathbb{Q}[\xi_N]$; and finally, the Galois automorphism $\sigma_\ell$ corresponds to
the modular transformation $\begin{pmatrix} \ell & 0 \\ 0 & \ell^{-1} \end{pmatrix} \in \mathrm{SL}_2(\mathbb{Z}_N)$, and so we get

    $\rho\begin{pmatrix} \ell & 0 \\ 0 & \ell^{-1} \end{pmatrix}_{ab} = \epsilon_\ell(a)\,\delta_{b,\sigma_\ell a}$,  $\forall a, b \in \Phi$,    (6.1.11a)

    $T_{\sigma_\ell a,\sigma_\ell a} = (T_{aa})^{\ell^2}$,  $\forall a \in \Phi$.    (6.1.11b)


The finite group $\mathrm{SL}_2(\mathbb{Z}_N)$ arises as $\mathrm{SL}_2(\mathbb{Z})/\Gamma(N)$. The quantity ‘$\ell^{-1}$’ denotes the
multiplicative inverse of $\ell$ (mod N), and exists because $\gcd(\ell, N) = 1$. We return to the
congruence property in Section 6.3.3. Probably Definition 6.1.6 is so weak that some
‘sick’ S, T are examples. It is expected, however, that all reasonably healthy modular
data, for example, modular data associated with nice CFTs, VOAs or modular categories,
would obey the congruence property (or at least something close to it). It is known [169]
that modular data obeying the congruence property will typically (always?) be realised
by some vector-valued modular function as in (4.3.9).


6.1.3 Modular invariants 


Modular data axiomatises the appearance of SL2(Z) in unitary RCFT. Two places mod- 
ular data directly impacts on RCFT are Verlinde’s formula (6.1.2) and the partition 
function (4.3.8b). 


Definition 6.1.8 Choose any modular data S, T. A modular invariant is a matrix Z,
with rows and columns labelled by $\Phi$, obeying:

MI1. ZS = SZ and ZT = TZ;

MI2. $Z_{ab} \in \mathbb{N}$ for all $a, b \in \Phi$; and

MI3. $Z_{00} = 1$.


It will be convenient at times to rewrite ZS = SZ as $SZ\overline{S} = Z$ (recall that S is unitary
and symmetric, so $S^{-1} = \overline{S}$). The easiest modular invariants are the identity Z = I and
charge-conjugation Z = C. More generally, Z is a modular invariant iff CZ is.

Modular invariants axiomatise the 1-loop partition functions $Z(\tau)$ (4.3.8b) of RCFT.
More precisely, an RCFT consists of two VOAs, called chiral algebras. For convenience 
we will take them to be isomorphic, though this is not necessary (when they aren’t isomor- 
phic, the theory is called ‘heterotic’). The modular invariant describes how these VOAs 
act on the state space H, that is how H decomposes into modules of the chiral algebras: 


    $H = \bigoplus_{a,b\in\Phi} Z_{ab}\, H_a \otimes H_b$.

MI2 holds because the $Z_{ab}$ are multiplicities. The adjoint module $H_0 \otimes H_0$ contains
the vacuum $1 \otimes 1$, and MI3 says there should be only one vacuum. Finally, the 1-loop
partition function $Z(\tau)$, being a physical correlation function defined on the torus,
must be invariant with respect to the modular group SL2(Z) of the torus. Equivalently,
$Z(\tau) = Z(-1/\tau) = Z(\tau + 1)$. Applying (4.3.9) and the unitarity of S and T gives the
modular invariance condition MI1.

Perhaps it is because of their basic importance to RCFT, but the lists of modular 
invariants associated with affine algebras (Section 6.2.1) are quite remarkable. They also 
play natural roles for subfactors and VOAs, as we’ll see. 

A second partition function, playing the same role for boundary CFT (the open string)
that $Z(\tau)$ plays for bulk CFT (the closed string), is that corresponding to a cylinder. Its
coefficient matrices $\mathcal{N}_a$ define a fusion ring representation (6.2.6), called a NIM-rep
[47], [236]. Although they are a fascinating part of the bigger picture, we’ll say little
about them in this book.

Fix a choice of modular data. Commutation MI1 of Z with T is trivial to solve, since
T is diagonal: it yields the selection rule

    $Z_{ab} \neq 0 \;\Rightarrow\; T_{aa} = T_{bb}$.    (6.1.12)
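To see how restrictive (6.1.12) is in practice, here is a small Python sketch that lists the pairs a < b allowed by the T selection rule for $A_1^{(1)}$ at level k, using the exponent of $T_{aa}$ read off from (6.2.2b). Exact rational arithmetic avoids any floating-point comparison; the function names are illustrative.

```python
from fractions import Fraction


def t_exponent(a, k):
    """Exponent x (mod 1) with T_aa = exp(2 pi i x), read off from (6.2.2b)."""
    return (Fraction((a + 1) ** 2, 4 * (k + 2)) - Fraction(1, 8)) % 1


def allowed_pairs(k):
    """Pairs a < b with T_aa = T_bb, i.e. pairs for which Z_ab may be nonzero."""
    return [(a, b) for a in range(k + 1) for b in range(a + 1, k + 1)
            if t_exponent(a, k) == t_exponent(b, k)]
```

At level 10 only four off-diagonal pairs survive, so any modular invariant there is already tightly constrained before commutation with S is imposed.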
More subtle and valuable is commutation with S. In particular, each symmetry of S
yields a symmetry of Z, a selection rule telling us certain entries of Z must vanish, and
a way to construct new modular invariants.

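First consider simple-currents j, j′. Equation (6.1.4a) and positivity tell us

    $Z_{jj'} = \Big|\sum_{c,d\in\Phi} \varphi_j(c)\, S_{0c}\, Z_{cd}\, \overline{S_{d0}}\;\overline{\varphi_{j'}(d)}\Big| \le \sum_{c,d\in\Phi} S_{0c}\, Z_{cd}\, S_{d0} = Z_{00} = 1$.

Thus $Z_{jj'} \neq 0$ implies $Z_{jj'} = 1$, as well as the selection rule

    $Z_{cd} \neq 0 \;\Rightarrow\; \varphi_j(c) = \varphi_{j'}(d)$.    (6.1.13a)

A similar calculation yields the symmetry

    $Z_{jj'} \neq 0 \;\Rightarrow\; Z_{Ja,J'b} = Z_{ab}$,  $\forall a, b \in \Phi$.    (6.1.13b)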


The most useful application of simple-currents to modular invariants is to their
construction. In particular, let j = J0 be a simple-current of order n. Then (by Question
6.1.7(b)) we can find integers $r_j$ and $Q_j(a)$ such that

    $\varphi_j(a) = \exp\!\left[2\pi i\,\frac{Q_j(a)}{n}\right]$,   $T_{jj}\,\overline{T_{00}} = \exp\!\left[2\pi i\, r_j\,\frac{n-1}{2n}\right]$.

Now define the matrix Z[j] by [489]

    $Z[j]_{ab} = \sum_{\ell=1}^{n} \delta_{J^\ell a,\, b}\;\delta\!\left(\frac{Q_j(a)}{n} + \frac{\ell}{2n}\, r_j\right)$,    (6.1.14)

where $\delta(x) = 1$ when $x \in \mathbb{Z}$ and is 0 otherwise. This matrix will be a modular invariant
iff $T_{jj}\,\overline{T_{00}}$ is an nth root of 1. For instance, Z[0] = I.
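The construction (6.1.14) can be tried out on $A_1^{(1)}$ at level k, whose data (6.2.2a), (6.2.2b) and simple-current Ja = k − a appear in Section 6.2.1. The sketch below assumes n = 2, $Q_j(a) = a$ and $r_j = k$ for j = k (values inferred from $\varphi_j(a) = (-1)^a$ and (6.2.2b), not quoted from the text), builds Z[j], and tests MI1 and MI3 numerically. The function names are illustrative.

```python
import cmath
import math


def a1_data(k):
    """S and the diagonal of T for A_1^(1) at level k, as in (6.2.2a), (6.2.2b)."""
    S = [[math.sqrt(2 / (k + 2)) * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
          for b in range(k + 1)] for a in range(k + 1)]
    T = [cmath.exp(1j * math.pi * ((a + 1) ** 2 / (2 * (k + 2)) - 1 / 4))
         for a in range(k + 1)]
    return S, T


def simple_current_matrix(k):
    """Z[j] of (6.1.14) for j = k, assuming n = 2, Q_j(a) = a and r_j = k."""
    n, r = 2, k
    Z = [[0] * (k + 1) for _ in range(k + 1)]
    for a in range(k + 1):
        for l in range(1, n + 1):
            b = k - a if l % 2 else a          # J^l a, with J a = k - a
            # delta(Q_j(a)/n + l r_j/(2n)) = 1 iff 2 Q_j(a) + l r_j is divisible by 2n
            if (2 * a + l * r) % (2 * n) == 0:
                Z[a][b] += 1
    return Z


def is_modular_invariant(Z, k, tol=1e-9):
    """Check MI1 (ZS = SZ, ZT = TZ) and MI3 for a nonnegative integer matrix Z."""
    S, T = a1_data(k)
    size = k + 1
    for a in range(size):
        for b in range(size):
            zs = sum(Z[a][c] * S[c][b] for c in range(size))
            sz = sum(S[a][c] * Z[c][b] for c in range(size))
            if abs(zs - sz) > tol:
                return False
            if abs(Z[a][b] * (T[a] - T[b])) > tol:
                return False
    return Z[0][0] == 1
```

For even k the matrix passes the test, while for odd k it fails, in line with the criterion that $T_{jj}\,\overline{T_{00}} = e^{2\pi i k/4}$ must be a square root of 1.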

Now look at the consequences of Galois. Applying the Galois automorphism $\sigma$ to
$Z = SZ\overline{S}$ yields, from (6.1.7) and $Z_{ab} \in \mathbb{Q}$, the equation

    $Z_{ab} = \sum_{c,d\in\Phi} \epsilon_\sigma(a)\, S_{\sigma a,c}\, Z_{cd}\, \overline{S_{d,\sigma b}}\;\epsilon_\sigma(b) = \epsilon_\sigma(a)\,\epsilon_\sigma(b)\, Z_{\sigma a,\sigma b}$.

(Why must $\sigma$ commute with complex conjugation?) Because $Z_{ab} \ge 0$, this implies the
selection rule and symmetry

    $Z_{ab} \neq 0 \;\Rightarrow\; \epsilon_\sigma(a) = \epsilon_\sigma(b)$,    (6.1.15a)

    $Z_{\sigma a,\sigma b} = Z_{ab}$,    (6.1.15b)

valid for any $\sigma$. Of all the equations (6.1.13) and (6.1.15), (6.1.15a) is the most useful.
The reader can try to construct modular invariants from certain special $\sigma_\ell$.


6.1.4 The generators and relations of RCFT 


In fundamental and influential work of the late 1980s, Moore and Seiberg [436], [437]
isolated the data (finite-dimensional vector spaces and linear transformations) defining
each chiral half of RCFT, and provided a complete set of relations they satisfy. Roughly,
they do for topological field theories in 2 + 1 dimensions what Theorem 4.4.4 does in
1 + 1 dimensions. Most of their work has been rigorously clarified in the important book
[32]. This section sketches the basic ideas.

Fig. 6.1 A vertex.

Their goal is to understand the spaces $\mathfrak{B}(\Sigma)$ of chiral blocks (Section 4.3.2). As in
Section 4.4.1, incoming strings are those boundary circles oriented oppositely to the
surface. We can change the orientation of a boundary circle provided we also replace its
label (a module $M \in \Phi(V)$) with its charge-conjugate $M^*$ (5.3.4a). Thus, for instance,
the spaces $\mathfrak{B}^{(0,3)}_{a\,b}{}^{c}$ and $\mathfrak{B}^{(0,3)}_{a\,b\,c^*}$ are naturally isomorphic in this way.

We know from the proof of Theorem 4.4.4 that we can build up an arbitrary surface
with boundary by sewing together discs, cylinders and pairs-of-pants. Hence the basic
building block is the vertex in Figure 6.1. In the spirit of the diagrams of Section 1.6.2, it
can be written as the graph on the right. This vertex represents an intertwining operator,
one of the operators appearing in (4.3.7). They are a natural generalisation of vertex
operators (in fact they are often called that), and they generate the chiral blocks $\mathcal{F}$ in
exactly the same way that quantum fields generate correlation functions (4.3.1a).


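Definition 6.1.9 [199], [436] Let V be a VOA, and let $(M^i, Y^i)$, for labels $i \in \Phi$, be
its irreducible modules. For any $a, b, c \in \Phi$, an intertwining operator $\mathcal{Y}$ of type $\binom{c}{a\ b}$ is a
linear map

    $w^a \mapsto \mathcal{Y}(w^a, z) = \sum_{n\in\mathbb{Q}} w^a_{(n)}\, z^{-n-1}$    (6.1.16)

for each $w^a \in M^a$, where each mode $w^a_{(n)} \in \mathrm{Hom}(M^b, M^c)$ (hence the name ‘intertwiner’),
such that for all $w^a \in M^a$, $w^b \in M^b$ and $v \in V$, $w^a_{(n)}(w^b) = 0$ for all sufficiently
large n (depending on both $w^a$, $w^b$), and we have both

    $z_0^{-1}\,\delta\!\left(\frac{z_1 - z_2}{z_0}\right) Y^c(v, z_1)\,\mathcal{Y}(w^a, z_2)\,w^b \;-\; z_0^{-1}\,\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) \mathcal{Y}(w^a, z_2)\,Y^b(v, z_1)\,w^b$
    $\qquad = z_2^{-1}\,\delta\!\left(\frac{z_1 - z_0}{z_2}\right) \mathcal{Y}(Y^a(v, z_0)\,w^a, z_2)\,w^b$,

    $\frac{d}{dz}\,\mathcal{Y}(w^a, z) = \mathcal{Y}(L_{-1} w^a, z)$.

Let $\mathcal{V}\binom{c}{a\ b}$ denote the space of all $\mathcal{Y}$ of the given type.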
Fig. 6.2 The braiding operator B (panels (a) and (b)).

In short, the intertwining operator obeys all the properties the vertex operator $Y_M$ obeys
in Definition 5.3.1. Of course the z-derivative in the definition completely specifies the
z-dependence of an intertwining operator. Note that the defining vertex operator Y(v, z)
of a VOA is an intertwining operator of type $\binom{0}{0\ 0}$, while the vertex operator $Y_{M^a}$ of the
module $M^a$ is of type $\binom{a}{0\ a}$. Summing the formal power series in (6.1.16) over $\mathbb{Q}$ is a little
lazy here: the sum really is over $n \in r + \mathbb{Z}$, where $r = \mathrm{wt}\, w^a + \mathrm{wt}\, w^b - \mathrm{wt}\, w^c \in \mathbb{Q}$ for
homogeneous vectors $w^a \in M^a$, $w^b \in M^b$, $w^c \in M^c$. The
analogue of the grading vA1 here is that $\mathrm{wt}\, w^a_{(n)} = \mathrm{wt}\, w^a - n - 1$.
The dimension of the space of intertwiners is just the fusion multiplicities:

    $\dim \mathcal{V}\binom{c}{a\ b} = \dim \mathfrak{B}^{(0,3)}_{a\,b}{}^{c} = N_{ab}^{c} < \infty$.    (6.1.17)

Finding a basis for the space $\mathfrak{B}(\Sigma)$ of chiral blocks of a given surface $\Sigma$
is now trivial, at this formal level: simply perform the following Feynman rules.


(i) Fix a basis for each space $\mathcal{V}\binom{c}{a\ b}$ of intertwining operators.

(ii) Fix some dissection of $\Sigma$ into pairs-of-pants, as in Figure 4.12 (it is more
convenient but not necessary to draw the corresponding trivalent graph).

(iii) Assign to each internal cut, or equivalently each internal edge of the trivalent
graph, a dummy label.

(iv) For each vertex in your dissection, bounded by labels $a, b, c \in \Phi$ (appropriately
oriented), choose an intertwining operator from the basis of the appropriate space
of intertwiners.

(v) ‘Evaluate’ the corresponding chiral block in (4.3.7); this is a desired basis vector.

(vi) Repeat for each operator in your basis, and each possible value of all dummy
labels.


For example, consider the left-most dissection in Figure 6.2(a) of a sphere with four
boundary components. Let $\mathcal{Y}$ and $\mathcal{Y}'$ be any intertwining operators of the two types
attached to the two vertices of that dissection. Then we get a chiral block

    $\mathcal{F} = \langle w^c,\; \mathcal{Y}(w^a, z)\, \mathcal{Y}'(w^d, z')\, w^b \rangle$,    (6.1.18)

where Möbius invariance was used to send the b- and c-marked points to 0 and ∞.
Section 9.3 of [253] gives a more physical description of sewing. Incidentally, each
dissection corresponds to moving towards a ‘maximally degenerate’ boundary point on
$\mathfrak{M}_{g,n}$ (recall Section 2.1.4), that is deforming the surface ever more closely to a trivalent
graph.

Fig. 6.3 The fusing operator F (panels (a) and (b)).

For each dissection, the chiral blocks of (v) are linearly independent and form a basis
for $\mathfrak{B}(\Sigma)$. Counting dimensions therefore yields an identity among fusion multiplicities
for any pair of dissections of each labelled surface. For instance,
the dissections in Figures 6.2(a) and 6.3(a) tell us the nontrivial fact that

    $\dim \mathfrak{B}^{(0,4)}_{a\,b\,c}{}^{d} = \sum_{e\in\Phi} N_{ab}^{e}\, N_{ec}^{d} = \sum_{f\in\Phi} N_{bc}^{f}\, N_{af}^{d} = \sum_{g\in\Phi} N_{ac}^{g}\, N_{gb}^{d}$.    (6.1.19)

These identities imply that the fusion ring of an RCFT, defined here formally to have
structure constants $N_{ab}^{c}$, is both commutative and associative. All of these product
formulae can be quickly deduced from Verlinde’s formula (6.1.2).
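The equality of the three channel sums in (6.1.19) is easy to test on a concrete fusion ring. The sketch below uses the $A_1^{(1)}$ level-k fusion rule (6.2.2c) of Section 6.2.1 as sample data and evaluates one sum per dissection of the four-holed sphere; the index placement follows one consistent reading of (6.1.19), and the function names are illustrative.

```python
def a1_fusion(a, b, c, k):
    """Fusion multiplicity N_ab^c of A_1^(1) at level k, from (6.2.2c)."""
    ok = (a + b + c) % 2 == 0 and abs(a - b) <= c <= min(a + b, 2 * k - a - b)
    return 1 if ok else 0


def four_point_dimensions(a, b, c, d, k):
    """One sum per dissection of the sphere with four boundary components."""
    labels = range(k + 1)
    first = sum(a1_fusion(a, b, e, k) * a1_fusion(e, c, d, k) for e in labels)
    second = sum(a1_fusion(b, c, f, k) * a1_fusion(a, f, d, k) for f in labels)
    third = sum(a1_fusion(a, c, g, k) * a1_fusion(g, b, d, k) for g in labels)
    return first, second, third
```

At level 2 with all four labels equal to 1 (the analogue of four spin fields in the Ising model), each sum evaluates to 2, the dimension of the corresponding space of chiral blocks.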

As we’ve repeatedly mentioned, a given surface can be dissected in different ways.
Duality here is the statement that although each dissection of $\Sigma$ produces a different
basis of chiral blocks, they must be bases for the same space $\mathfrak{B}(\Sigma)$, that is there must
be invertible linear maps relating the chiral blocks of different dissections. Consider the
easy examples in Figures 6.2 and 6.3. There we’ve given three dissections of the (g, n) =
(0, 4) surface. The corresponding linear maps (actually matrices, given our explicit but
noncanonical bases) are denoted

    $B = \bigoplus_{e,f\in\Phi} B_{ef}\begin{bmatrix} b & c \\ a & d \end{bmatrix}$  and  $F = \bigoplus_{e,g\in\Phi} F_{eg}\begin{bmatrix} b & c \\ a & d \end{bmatrix}$.

For the purposes of manipulating identities, it is convenient to
represent these operators pictorially as in (b) (recall Section 1.6.2). Because of these
pictures, they are usually called braiding and fusing. They play the same role here as the
Clebsch–Gordan and Racah coefficients (or 3j- and 6j-symbols), respectively, play in
the Lie theory of the quantum mechanics literature. See also the treatment in chapter 16
of [214].

The proposition at the end of [278] gives us four basic ‘moves’ from which any two 
dissections can be related. These occur for surfaces with 


    (g, n) = (0, 1), (0, 2), (0, 4), (1, 1)    (6.1.20)


(namely, the surfaces that need at most one cut to unfold them into discs, cylinders or
pairs-of-pants). The one for (1, 1) is given in Figure 6.4. The corresponding operator is
called S(a) because it corresponds to the modular transformation $\tau \mapsto -1/\tau$. The result
of [278] is the key to proving that a few dualities generate all others. In particular, all
duality transformations can be written in terms of F, B, S and $e^{2\pi i c/24}$.

Fig. 6.4 The S-operator S(a).

Fig. 6.5 A typical identity.

These duality operators obey several identities, coming from surfaces (0, 5) and (1, 2) 
(those requiring two cuts to decompose into pairs-of-pants). An example is Figure 6.5; 
another is the Yang—Baxter equation (Figure 1.29). The reader is encouraged to write 
these identities down explicitly. Figure 6.5 has the shape F BB = BF, while the Yang- 
Baxter equation looks like BBB = BBB. Other identities are given in section 3 of 
[437]. 

[436] argue, and [32] prove, that all mapping class group actions on the spaces $\mathfrak{B}(\Sigma)$
can be deduced from these relations. They also argue that Verlinde’s formula (6.1.2)
follows, by considering the space $\mathfrak{B}^{(1,2)}$.

For example, consider the Ising model (Section 4.3.2). Here $\Phi = \{1, \epsilon, \sigma\}$. Its modular
data S, T is given in (4.3.11), and a basis for the space of chiral blocks in $\mathfrak{B}^{(0,4)}_{\sigma\sigma\sigma\sigma}$ is given
in (4.3.13). Its fusion ring is defined by $\epsilon \boxtimes \epsilon = 1$, $\epsilon \boxtimes \sigma = \sigma$ and $\sigma \boxtimes \sigma = 1 + \epsilon$.
Recall that these blocks assume that the four points $z_1, \ldots, z_4$ have been mapped to
0, w, 1, ∞, respectively (so w goes to the cross-ratio). To find the fusing matrix, one
way is to note that this duality interchanges the roles of $z_1 = 0$ and $z_3 = 1$, and therefore
corresponds to the Möbius transformation $w \mapsto (1 - w)/(1 - 0) = 1 - w$. Likewise,
braiding interchanges $z_2$ with $z_3$, and so corresponds to the Möbius transformation $w \mapsto$
$(0 - 1)/(0 - w) = 1/w$. When applying Möbius transformations to chiral blocks, recall
(4.3.5); equivalently, chiral blocks (of quasi-primaries) are often written as differential 
forms: here they are F; dw~!. The braiding and fusing matrices here become 


oo| 1 y y? 
al? el-al ; ). (6.1.21a) 
2 —2 
P|? e Ai (6.1.21b) 
for some primitive 16th root $\gamma$ of 1. Also, $S(\sigma)$ is 0 × 0 (since $N_{\sigma b}^{b} = 0$ for every b, by (6.1.2)),
and S(€) = (y~’). 


Question 6.1.1. (a) Directly from Definition 6.1.3, prove that the fusion ring homomorphism
$*$ in F2 is an involution (i.e. $** = \mathrm{id}$).

(b) Again directly from the definition, prove that $N_{a^*} = (N_a)^t$ for any fusion matrix
$N_a$.

(c) Directly from the definition, prove that the numbers $N_{abc} := N_{ab}^{c^*}$ in any fusion ring
are completely symmetric in a, b, c.


Question 6.1.2. Choose your favourite character table theorem in, for example, [308] 
and find and prove the fusion ring analogue. 


Question 6.1.3. Prove that a fusion ring $R(B, M) \otimes_{\mathbb{Z}} \mathbb{Q}$, considered as an algebra over $\mathbb{Q}$,
is isomorphic to a direct sum of number fields. Construct these number fields explicitly, 
from the matrix S. (Hint: (6.1.7) may be helpful.) 


Question 6.1.4. Prove Theorem 6.1.5. 
Question 6.1.5. Classify all one- and two-dimensional fusion rings and modular data. 


Question 6.1.6. What happens to the modular data of the lattice example when the lattice
is integral but not even (i.e. it has odd norm-squared vectors)?


Question 6.1.7. (a) Prove (6.1.13b).

(b) Prove that if j = J0 is order n, then $\varphi_j(a)$ is an nth root of unity, and for n odd
$T_{Ja,Ja}\,\overline{T_{aa}}$ is also an nth root of 1, while for n even it is a 2nth root of 1.

(c) Prove that the set of all simple-currents forms an abelian group (with respect to
composition of the permutations J).

(d) Prove that $N_{Ja,J'b}^{JJ'c} = N_{ab}^{c}$. Describe $\sigma j$ and $\epsilon_\sigma(j)$ for simple-currents j, for any
$\sigma \in \mathrm{Gal}(\mathbb{Q}[\xi_N]/\mathbb{Q})$.


Question 6.1.8. Suppose all $a \in \Phi$ are simple-currents. Prove that any modular invariant
is of the form (6.1.14). 


Question 6.1.9. Suppose we have four sets of functions, namely $a_i(z)$ and $b_i(z)$ (for
$1 \le i \le n$), and $c_j(z)$ and $d_j(z)$ (for $1 \le j \le m$), and they are all holomorphic in some
common domain (e.g. the unit disc). Suppose the equality 


Yo ai(2)bi@) = GOGE 
i=1 j=l 


holds throughout that domain. Then n = m and there is an invertible n x n matrix M 
such that both 


ai(z)= >> Mijcj@), b= J (Mid). 
j=l 


j=l 
6.2 Examples
6.2.1 Affine algebras


The mathematical riches of CFT go far beyond Lie theory, but CFT would have remained 
an esoteric part of mathematical physics, unknown to mathematics proper, if its deep 
connection to Lie theory hadn’t been discovered. 

The source of some of the most interesting modular data are the nontwisted affine
Kac–Moody algebras $g = X_r^{(1)}$ (Section 3.2). We are interested in its integral highest
weights $\lambda \in P_+(g)$ with a given fixed level $k \in \mathbb{N}$.

Recall that the g-character $\chi_\lambda(\tau)$ (3.2.11c) is essentially a lattice theta function, and
transforms nicely under the modular group SL2(Z). In fact, the SL2(Z)-representation
$\rho$ of Theorem 3.2.3 defines modular data. The ‘vacuum’ is $0 = k\omega_0$, and the set of
‘primaries’ $\Phi$ are the highest weights $P_+^k(g)$ given in (3.2.8). The matrix T is related to
the eigenvalues of the second Casimir operator of $\overline{g} = X_r$, and S to elements of finite
order in the Lie group of $X_r$ [333]:

    $T_{\lambda\mu} = \exp\!\left[\frac{-\pi i\,(\rho|\rho)}{h^\vee}\right] \exp\!\left[\frac{\pi i\,(\lambda + \rho\,|\,\lambda + \rho)}{k + h^\vee}\right] \delta_{\lambda,\mu}$,    (6.2.1a)

    $S_{\lambda\mu} = \alpha \sum_{w\in\overline{W}} \det(w)\, \exp\!\left[-2\pi i\,\frac{(w(\lambda + \rho)\,|\,\mu + \rho)}{k + h^\vee}\right]$,    (6.2.1b)

    $\frac{S_{\lambda\mu}}{S_{0\mu}} = \mathrm{ch}_{\overline{\lambda}}\!\left(-2\pi i\,\frac{\overline{\mu} + \rho}{k + h^\vee}\right)$.    (6.2.1c)

The unimportant number $\alpha$ is given explicitly in theorem 13.8(a) of [328]. The inner-
product is the usual Killing form of $\overline{g}$, $\overline{W}$ is the (finite) Weyl group of $\overline{g}$, $\rho$ is the Weyl
vector $\sum_{i=1}^{r} \omega_i$ and $h^\vee$ is the dual Coxeter number (the sum $\sum_{i=0}^{r} a_i^\vee$ of the colabels
in Figure 3.2). Also, $\overline{\lambda}$ denotes the projection $\lambda_1\omega_1 + \cdots + \lambda_r\omega_r$, and ‘$\mathrm{ch}_{\overline{\lambda}}$’ is the
appropriate finite-dimensional Lie group character.

The combinatorics of Lie group characters at elements of finite order, that is the 
ratios (6.2.1c), are quite rich and have been studied by many people. For instance, [431] 
show that they lead to quick algorithms for computing, for example, tensor product 
multiplicities. Kac [327] used them in a Lie theoretic proof of quadratic reciprocity. 

For example, for $A_1^{(1)}$ at level k, we may take $P_+^k = \{0, 1, \ldots, k\}$ (the value of $\lambda_1$),
and then the S and T matrices and fusion multiplicities are given by

    $S_{ab} = \sqrt{\frac{2}{k+2}}\; \sin\!\left(\pi\,\frac{(a+1)(b+1)}{k+2}\right)$,    (6.2.2a)

    $T_{aa} = \exp\!\left[\frac{\pi i\,(a+1)^2}{2(k+2)} - \frac{\pi i}{4}\right]$,    (6.2.2b)

    $N_{ab}^{c} = 1$ if $c \equiv a + b$ (mod 2) and $|a - b| \le c \le \min\{a + b,\, 2k - a - b\}$, and $N_{ab}^{c} = 0$ otherwise.    (6.2.2c)
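The closed-form rule (6.2.2c) can be compared numerically with the Verlinde sum applied to the S matrix (6.2.2a). The Python sketch below does this for small levels; it assumes the standard form $N_{ab}^{c} = \sum_d S_{ad} S_{bd} \overline{S_{cd}} / S_{0d}$ of Verlinde’s formula (here S is real, so no conjugation is needed), and the function names are illustrative.

```python
import math


def a1_S(k):
    """The S matrix (6.2.2a) of A_1^(1) at level k."""
    norm = math.sqrt(2 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]


def verlinde(S, a, b, c):
    """Verlinde sum for a real S matrix: sum_d S_ad S_bd S_cd / S_0d."""
    return sum(S[a][d] * S[b][d] * S[c][d] / S[0][d] for d in range(len(S)))


def fusion_rule(a, b, c, k):
    """The closed-form multiplicity (6.2.2c)."""
    ok = (a + b + c) % 2 == 0 and abs(a - b) <= c <= min(a + b, 2 * k - a - b)
    return 1 if ok else 0


def rule_matches_verlinde(k, tol=1e-9):
    """True when (6.2.2c) agrees with the Verlinde sum for every triple of labels."""
    S = a1_S(k)
    labels = range(k + 1)
    return all(abs(verlinde(S, a, b, c) - fusion_rule(a, b, c, k)) < tol
               for a in labels for b in labels for c in labels)
```

The upper bound 2k − a − b in (6.2.2c) is the only place the level enters; dropping it recovers the Clebsch–Gordan rule for the finite-dimensional algebra.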


For $A_1^{(1)}$ the matrix S is real and so charge-conjugation C = id. More generally, for $X_r^{(1)}$,
C corresponds to a symmetry of the Coxeter–Dynkin diagram of $X_r$. For $A_1^{(1)}$, there
is precisely one nontrivial simple-current, namely j = k, corresponding to Ja = k − a
and $\varphi_j(a) = (-1)^a$. More generally, to any affine algebra (except for $E_8^{(1)}$ at k = 2), the
simple-currents correspond to symmetries of the extended Coxeter–Dynkin diagram. For
$A_1^{(1)}$ this symmetry interchanges the zeroth and first nodes, that is $J(\lambda_0\omega_0 + \lambda_1\omega_1) =$
$\lambda_1\omega_0 + \lambda_0\omega_1$ (recall $a = \lambda_1$ and $k = \lambda_0 + \lambda_1$).

Fig. 6.6 Tensor and fusion products $L(2, 0) \otimes L(1, 1)$ and $L(0, 2, 0) \boxtimes_2 L(0, 1, 1)$ (panels (a), (b), (c)).

The fusion multiplicities $N_{\lambda\mu}^{\nu}$, defined by (6.1.1b), are essentially the tensor product
multiplicities $T_{\overline{\lambda}\,\overline{\mu}}^{\overline{\nu}} := \mathrm{mult}_{\overline{\lambda}\otimes\overline{\mu}}(\overline{\nu})$ for $\overline{g}$ (as opposed to the unrelated and less interesting
tensor product multiplicities of g), except ‘folded’ in a way depending on the level k.
This is seen explicitly by the Kac–Walton formula (see [328] page 288, [552], though
there are other co-discoverers):

    $N_{\lambda\mu}^{\nu} = \sum_{w\in W} \det(w)\; T_{\overline{\lambda}\,\overline{\mu}}^{\overline{w.\nu}}$,    (6.2.3a)

where $w.\gamma := w(\gamma + \rho) - \rho$ and W is the affine Weyl group of $X_r^{(1)}$ (the dependence
on k arises through this action of W). The proof follows quickly from (6.2.1c). This
practical formula is also described in Section 16.2 of [131] and Section 4.9 of [553].
Equation (6.2.3a) looks more natural when viewed as follows. The Racah–Speiser
formula (there are other co-discoverers) for tensor product multiplicities says

    $T_{\overline{\lambda}\,\overline{\mu}}^{\overline{\nu}} = \sum_{w\in\overline{W}} \det(w)\; \dim L(\overline{\mu})_{w.\overline{\nu} - \overline{\lambda}}$.    (6.2.3b)

Combining (6.2.3) gives the ‘affinisation’ of Racah–Speiser:

    $N_{\lambda\mu}^{\nu} = \sum_{w\in W} \det(w)\; \dim L(\overline{\mu})_{\overline{w.\nu} - \overline{\lambda}}$.    (6.2.3c)


For example, the weights for the eight-dimensional $A_2$-module L(1, 1) are given in
Figure 6.6(a). In Figure 6.6(b), we translate this weight space by $\rho + \lambda = (3, 1)$. Equation
(6.2.3b) now tells us to Weyl-reflect each dot not in the $A_2$ alcove $P_+ + \rho$. Two of these
dots are fixed by a Weyl reflection and so cancel themselves. Weight (4, −1) gets Weyl
reflected to (3, 1) and so reduces the multiplicity there by 1. Shifting back by $\rho = (1, 1)$,
we thus get the tensor product

    $L(2, 0) \otimes L(1, 1) = L(0, 1) \oplus L(2, 0) \oplus L(1, 2) \oplus L(3, 1)$.

The calculation of the $A_2^{(1)}$ fusion multiplicity at, for example, level 2 (Figure 6.6(c))
is identical, except we now have extra Weyl reflections and the alcove is much smaller.
The weight (4, 2) now lies outside the alcove, and reflects to (3, 1) where it reduces that
multiplicity to 0. Thus we obtain the fusion product (writing the level as subscript)

    $L(0, 2, 0) \boxtimes_2 L(0, 1, 1) = L(1, 0, 1)$.
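The two computations above can be replayed mechanically. The following Python sketch hard-codes the weights of the $A_2$-module L(1, 1) in Dynkin labels, shifts them by $\lambda + \rho$, and reflects each shifted weight into the chamber (for (6.2.3b)) or into the level-k alcove (for (6.2.3c)), keeping track of the sign. The coordinate formulas for the three reflections and the function names are choices made for this illustration, not taken from the text.

```python
from collections import Counter

# Weights (Dynkin labels) of the 8-dimensional A2-module L(1,1), with multiplicities.
ADJOINT_WEIGHTS = {(1, 1): 1, (-1, 2): 1, (2, -1): 1, (0, 0): 2,
                   (1, -2): 1, (-2, 1): 1, (-1, -1): 1}


def reflect(a, b, kappa=None):
    """Reflect a rho-shifted weight into a, b > 0 (and a + b < kappa if given).

    Returns (a, b, sign), or None when the weight lies on a reflecting wall.
    """
    sign = 1
    while True:
        if a == 0 or b == 0 or (kappa is not None and a + b == kappa):
            return None
        if a < 0:
            a, b = -a, a + b                    # first simple reflection
        elif b < 0:
            a, b = a + b, -b                    # second simple reflection
        elif kappa is not None and a + b > kappa:
            a, b = kappa - b, kappa - a         # affine reflection at height kappa
        else:
            return a, b, sign
        sign = -sign


def product_with_adjoint(lam, level=None):
    """Decompose L(lam) times L(1,1): tensor product if level is None, else fusion."""
    kappa = None if level is None else level + 3    # k + dual Coxeter number of A2
    result = Counter()
    for (m1, m2), mult in ADJOINT_WEIGHTS.items():
        image = reflect(lam[0] + 1 + m1, lam[1] + 1 + m2, kappa)
        if image is not None:
            a, b, sign = image
            result[(a - 1, b - 1)] += sign * mult   # shift back by rho = (1, 1)
    return {weight: m for weight, m in result.items() if m != 0}
```

With `lam = (2, 0)` the tensor product has the four summands found above, while at level 2 only the finite weight (0, 1) survives, that is the affine weight (1, 0, 1).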


Equation (6.2.3a) has the flaw that, although the $N_{\lambda\mu}^{\nu}$ are manifestly integral, it is
not clear why they are positive. An open problem in the theory is the discovery of a
combinatorial rule, for example, in the spirit of the well-known Littlewood–Richardson
rule [217], for the affine algebra fusions. Such a rule for $A_r^{(1)}$ is conjectured in [88],
although it is quite complicated even for $A_2^{(1)}$.

Identical numbers $N_{\lambda\mu}^{\nu}$ appear in several other contexts, many of which we’ll see
below. Because of these isomorphisms, we know that the $N_{\lambda\mu}^{\nu}$ defined by (6.1.1b) and
(6.2.1b) do indeed lie in $\mathbb{N}$, for any affine algebra, as predicted by RCFT.

As mentioned before, the fusion product here is not the usual tensor product of affine 
algebra modules. However, the fusion product has been interpreted algebraically (with 
much effort) as a new kind of tensor product of affine algebra modules, in a series of 
papers by Kazhdan and Lusztig; it was proved equivalent to fusions in [190]. 

Fusion multiplicities arise in the quantum cohomology or Gromov–Witten invariants
of Grassmannians [565], [57], often called the ‘quantum Schubert calculus’. Recall
that ‘points’ in the projective plane consist of lines through the origin; more generally,
the Grassmannian Gr(m, n) consists of m-dimensional subspaces in $\mathbb{R}^n$. The
(classical) Schubert calculus (see e.g. [217]) uses the cohomology ring of Gr(m, n)
to solve problems in enumerative geometry such as ‘How many lines in projective 3-
space $\mathbb{P}^3(\mathbb{R})$ meet four given lines?’. On the other hand, the Gromov–Witten invariants
count surfaces lying in the Grassmannian, which satisfy certain conditions (see e.g.
[359]). The quantum cohomology ring (which counts spheres) of Gr(m, n) is isomorphic
to the fusion ring of $gl_m^{(1)} = (u_1 \oplus A_{m-1})^{(1)}$ at level (nm, n − m), ‘orbifolded’
with a ‘projection/field-identification’ given by the order-m simple-current $(J^{-n}, J)$; the
Gromov–Witten invariants are the fusion multiplicities. Now, there is a classical
isomorphism $\mathrm{Gr}(m, n) \cong \mathrm{Gr}(n - m, n)$ (why?); this implies that there is a close relation
(‘rank–level duality’) between the fusion rings of $A_r^{(1)}$ level k and $A_{k-1}^{(1)}$ level r + 1.
There are analogous rank–level dualities for the other classical algebras [428]. This is one
of many symmetries of the g fusion multiplicities that has no analogue for the $\overline{g}$ tensor
product multiplicities. Another example is that any symmetry of the extended Coxeter–
Dynkin diagram is a symmetry of fusion multiplicities. In short, affine algebra fusion
multiplicities are mathematically more interesting than their classical counterparts.

We have long known that the representation theory of a Lie group G is related to
K-theory. For example, the equivariant K-theory $K_G^0(p)$ of the (trivial) action of
G on a point p is the representation ring (over $\mathbb{Z}$). The analogue of this for fusion
rings is due to Freed–Hopkins–Teleman [193]: the fusion ring of $X_r^{(1)}$ at level k is the
twisted equivariant K-theory ${}^{k+h^\vee}K_G^{\dim G}(G)$, where G is the compact
simply-connected Lie group corresponding to $X_r$, G acts on itself by conjugation and
$k + h^\vee \in \mathbb{Z} \cong H^3_G(G, \mathbb{Z})$ is the twist. The strength of this important formulation is also
its weakness: it pushes most technical difficulties under the carpet, but what remains is
a clean conceptual characterisation of the fusion ring.

Fusion multiplicities also arise as dimensions of spaces of generalised theta functions
[179] (see also the discussion in [565]), as tensor product multiplicities in Hecke algebras
at roots of 1 [255] and modular representations for, for example, the Lie algebra $\overline{g}$ for
fields $\mathbb{F}_p$ (see e.g. [392]). In Section 6.1.1 we give another appearance of the An
fusion multiplicities. 

The Galois action for the affine algebras can be expressed geometrically using the
action of the affine Weyl group on the weight lattice of $X_r$. The parity $\epsilon_\sigma(\lambda)$ is quite
interesting (see e.g. [7] for cohomological and number-theoretic interpretations). For a
concrete example, consider $A_1^{(1)}$: (6.2.2a) shows explicitly that $S_{ab}$ lies in the cyclotomic
field $\mathbb{Q}[\xi_{4(k+2)}]$. Write {x} for the number congruent to x mod 2(k + 2) satisfying $0 \le$
{x} < 2(k + 2). Choose any Galois automorphism $\sigma = \sigma_\ell$. Then if $\{\ell(a + 1)\} < k + 2$,
we will have $a^\sigma = \{\ell(a + 1)\} - 1$, while if $\{\ell(a + 1)\} > k + 2$, we will have $a^\sigma = 2(k +
2) - \{\ell(a + 1)\} - 1$. The parity $\epsilon_\sigma(a)$ depends on a contribution from $\sqrt{2/(k+2)}$ (which can
usually be ignored), as well as the sign +1 or −1, respectively, depending on whether
or not $\{\ell(a + 1)\} < k + 2$.
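The recipe for $a^\sigma$ and the sign can be checked against the sine in (6.2.2a): replacing $(a+1)$ by $\ell(a+1)$ inside the sine should reproduce, up to the stated sign, the sine for the label $a^\sigma$. The Python sketch below implements the recipe and this numerical check; it ignores the contribution of the prefactor $\sqrt{2/(k+2)}$ to the parity, and the function names are illustrative.

```python
import math


def galois_a1(k, l, a):
    """Return (a_sigma, sign) for sigma = sigma_l acting on the A_1^(1) level-k label a."""
    modulus = 2 * (k + 2)
    if math.gcd(l, modulus) != 1:
        raise ValueError("l must be coprime to 2(k + 2)")
    x = (l * (a + 1)) % modulus                 # the number {l(a + 1)}
    if x < k + 2:
        return x - 1, 1
    return modulus - x - 1, -1


def check_galois(k, l, tol=1e-9):
    """Check that a -> a_sigma is a permutation and that the signs match the sines."""
    labels = range(k + 1)
    images = [galois_a1(k, l, a) for a in labels]
    is_permutation = sorted(image for image, _ in images) == list(labels)
    signs_match = all(
        abs(math.sin(math.pi * l * (a + 1) * (b + 1) / (k + 2))
            - sign * math.sin(math.pi * (image + 1) * (b + 1) / (k + 2))) < tol
        for a, (image, sign) in zip(labels, images) for b in labels)
    return is_permutation and signs_match
```

At level 10 with $\ell = 5$ the labels 0 and 6 are sent to 4 and 10 with equal signs, which is compatible with (6.1.15) for a modular invariant coupling those labels.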

Affine algebra modular data corresponds to Wess—Zumino—Witten RCFT [245], where 
a closed string lives on a Lie group manifold G. The action is given by the sum of two 
terms: one is an integral over the world-sheet and corresponds to a so-called sigma model 
[343] of a bosonic field living on G; the other is a topological Wess—Zumino term, an 
integral over the volume bounded by the (compactified) world-sheet. Classically, the 
sigma model by itself would be conformally invariant, but quantisation breaks this. It 
was Witten who realised that conformal invariance would be retained if the Wess—Zumino 
term was added. For topological reasons the Wess—Zumino term comes with an integral 
prefactor (or coupling constant), which we call the level k. 

Why is the level k always shifted by the dual Coxeter number $h^\vee$ in the formulae,
and the weights by the Weyl vector $\rho$? The $\rho$-shift appears even for the simple finite-
dimensional algebras (1.5.11), and arises from the combinatorics of geometric series. The
algebraic explanation of the $h^\vee$-shift was given after (3.2.15). Physically, in the Wess–
Zumino–Witten model, these $\rho$- and $h^\vee$-shifts also arise automatically: the former as a
quantum effect, due to normal-ordering or regularisation, much like the $q^{1/24}$ shift in
the Dedekind eta; the latter as an effect of latent supersymmetry caused by decoupling
fermions (see e.g. section 8 of [248], or [206]).

The modular data (6.2.2) of $A_1^{(1)}$ level k is related to the dilogarithm by the remarkable
formula

    $\frac{1}{L(1)} \sum_{b=1}^{k} L\!\left(\frac{S_{0a}^2}{S_{ba}^2}\right) = c - 24h_a + 6a$    (6.2.4a)

for each $a \in P_+^k$, where c = 3k/(k + 2) is the central charge and $h_a = \frac{a(a+2)}{4(k+2)}$ the
conformal weight (recall (3.2.9)). L(x) here is Rogers’ dilogarithm, which for 0 < x < 1 is
given by

    $L(x) = \sum_{n=1}^{\infty} \frac{x^n}{n^2} + \frac{1}{2}\log x\, \log(1 - x)$.    (6.2.4b)

We put $L(1) := \lim_{x\to 1^-} L(x) = \pi^2/6$. L(x) is strictly increasing, real-analytic, and obeys
L(x) + L(1 − x) = L(1) and

    $L(x) + L(y) = L(xy) + L\!\left(\frac{x(1-y)}{1-xy}\right) + L\!\left(\frac{y(1-x)}{1-xy}\right)$.    (6.2.4c)
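These dilogarithm identities lend themselves to a quick numerical check. The Python sketch below evaluates Rogers’ dilogarithm on [0, 1] from the series (6.2.4b), tests the five-term relation (6.2.4c) at a sample point, and evaluates the left side of (6.2.4a) in the vacuum case a = 0, where every argument lies in (0, 1] and the right side is just the central charge c = 3k/(k + 2). The truncation length and function names are choices made for this illustration.

```python
import math

L1 = math.pi ** 2 / 6          # the value L(1)


def rogers_L(x, terms=200):
    """Rogers' dilogarithm for 0 <= x <= 1 (values at or above 1 are treated as 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return L1
    if x > 0.5:
        # Use L(x) + L(1 - x) = L(1) so that the series is only summed for x <= 1/2.
        return L1 - rogers_L(1.0 - x, terms)
    series = sum(x ** n / n ** 2 for n in range(1, terms + 1))
    return series + 0.5 * math.log(x) * math.log(1.0 - x)


def five_term_gap(x, y):
    """Absolute difference between the two sides of (6.2.4c)."""
    lhs = rogers_L(x) + rogers_L(y)
    rhs = (rogers_L(x * y)
           + rogers_L(x * (1 - y) / (1 - x * y))
           + rogers_L(y * (1 - x) / (1 - x * y)))
    return abs(lhs - rhs)


def vacuum_sum(k):
    """Left side of (6.2.4a) for a = 0, using S_00/S_b0 from (6.2.2a)."""
    total = 0.0
    for b in range(1, k + 1):
        ratio = math.sin(math.pi / (k + 2)) / math.sin(math.pi * (b + 1) / (k + 2))
        total += rogers_L(min(1.0, ratio ** 2))
    return total / L1
```

For a > 0 some arguments of L leave the interval (0, 1], so checking (6.2.4a) there would need the continuation of L beyond (6.2.4b), which this sketch does not attempt.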


As was discovered by Lobachevsky and Schläfli in the nineteenth century, the dilogarithm
is related to volumes of tetrahedra, and several other appearances have been
uncovered since. Equation (6.2.4a) is the tip of the iceberg; see [347] for several other
identities and some history. (6.2.4a) can be proved by studying the $\tau \to 0$ asymptotics
of certain character formulae. For a simple example, the two k = 1 $A_1^{(1)}$ characters can
be written


    $\chi_{i\omega_1 + (1-i)\omega_0}(\tau) = q^{-1/24} \sum_{\substack{M,N\in\mathbb{N} \\ M+N+i\ \mathrm{even}}} \frac{q^{(M+N)^2/4}}{(q)_M\,(q)_N}$,    (6.2.5a)

where $(q)_M$ is the q-deformed factorial $\prod_{n=1}^{M}(1 - q^n)$. Similar expressions exist for all
other affine algebras and conjecturally all RCFT; see [14] for the state-of-the-art, and
below for a conjecture. Actually, (6.2.4a) is obtained from the asymptotics of these
character identities for certain non-unitary RCFTs, which have essentially the same S
matrix as (6.2.2a). An explanation of some of these identities (at least mod 1) has been
made by [164], who use the dilogarithm to express a natural map from $H_3(\mathrm{SL}_2(\mathbb{R}), \mathbb{Z})$
to $\mathbb{R}/\mathbb{Z}$.

Choose any r × r rational positive-definite matrix $A = A^t$, $b \in \mathbb{Q}^r$ and $d \in \mathbb{Q}$. Define

    $f_{A,b,d}(\tau) = \sum_{n\in\mathbb{N}^r} \frac{\exp[2\pi i\tau\,(n^t A n/2 + b^t n + d)]}{(q)_{n_1}\cdots(q)_{n_r}}$.    (6.2.5b)


Conjecture 6.2.1 (Nahm [444]) Let A be any r × r rational positive-definite matrix.
Then there are finitely many vectors $b_1, \ldots, b_m \in \mathbb{Q}^r$ and numbers $d_1, \ldots, d_m \in \mathbb{Q}$ such
that the functions $\chi_i(\tau) := f_{A,b_i,d_i}(\tau)$ are the entries of a vector-valued modular function
for SL2(Z), iff these $\chi_i(\tau)$ are the graded-dimensions of the m primaries of some (not
necessarily unitary) RCFT where $d_i = h_i - c/24$, iff there is a corresponding element
of finite order in the Bloch group.


The precise statement involving the Bloch group would take us too far afield, but see [444]
for details. This beautiful conjecture has been verified only for r = 1 (which has three
different A). A plausibility argument suggesting that RCFT characters should always be
of that form involves considering their massive integrable perturbations [444]. Torsion
in the Bloch group has known connections with modularity.

The affine algebra g arises in the Wess–Zumino–Witten model, for the same reason the
Virasoro does (recall the discussion around (4.3.4)): to each g € G we get a conserved
current, and its conserved charges define the level-k representation of g. As before, we
get two commuting actions of g on the state-space H, recovering the finite decomposition
(4.3.6b).

For affine algebra modular data, the classification of modular invariants seems to be 
just barely possible, and the answer is that (generically) the only modular invariants are 
constructed in straightforward ways from symmetries of the Coxeter-Dynkin diagrams. 
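For instance, consider A_1^{(1)}:


Theorem 6.2.2 [91] Recall that P_+^k = {0, 1, ..., k}, and the simple-current is given
by Ja = k − a. Then the complete list of A_1^{(1)} modular invariants is

    \mathcal{A}_{k+1} = \sum_{a=0}^{k} |\chi_a|^2    for all k ≥ 1,
    \mathcal{D}_{k/2+2} = \sum_{a=0}^{k} \chi_a \chi^*_{J^a a}    when k/2 is odd,
    \mathcal{D}_{k/2+2} = |\chi_0 + \chi_{J0}|^2 + |\chi_2 + \chi_{J2}|^2 + \cdots + 2|\chi_{k/2}|^2    when k/2 is even,
    \mathcal{E}_6 = |\chi_0 + \chi_6|^2 + |\chi_3 + \chi_7|^2 + |\chi_4 + \chi_{10}|^2    for k = 10,
    \mathcal{E}_7 = |\chi_0 + \chi_{16}|^2 + |\chi_4 + \chi_{12}|^2 + |\chi_6 + \chi_{10}|^2
                    + \chi_8 (\chi_2 + \chi_{14})^* + (\chi_2 + \chi_{14}) \chi_8^* + |\chi_8|^2    for k = 16,
    \mathcal{E}_8 = |\chi_0 + \chi_{10} + \chi_{18} + \chi_{28}|^2 + |\chi_6 + \chi_{12} + \chi_{16} + \chi_{22}|^2    for k = 28.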


A simple proof is given in [234]. The modular invariants 𝒜_⋆ and 𝒟_⋆ are generic, given
by (6.1.14), and correspond respectively to the order 1 (i.e. identity) and order 2 (i.e.
simple-current J) Coxeter-Dynkin diagram symmetries. Physically, 𝒜_⋆ and 𝒟_⋆ are the
partition functions (4.3.8b) of Wess–Zumino–Witten models on the SU_2(C) and SO_3(R)
group manifolds, respectively. The exceptionals ℰ_6 and ℰ_8 correspond to strings living
on Sp_4 and G_2 manifolds, at level 1. The ℰ_7 exceptional is harder to interpret, but is the
first in an infinite series of exceptionals involving rank-level duality and D_4 triality.

Around Christmas 1985, Zuber wrote to Kac about the A_1^{(1)} modular invariant problem,
and mentioned the modular invariants they knew at that point (what we now call
𝒜_⋆ and 𝒟_even). A few weeks later, Kac wrote back saying he found one more invariant,
and jokingly pointed out that it must indeed be quite exceptional as the exponents of E_6
appeared in it. By summer 1986, Cappelli–Itzykson–Zuber found ℰ_7, 𝒟_odd and then ℰ_8,
and at some point recalled by chance Kac’s cryptic remark. They rushed to the library
to find a list of the exponents of the other algebras, and were delighted to discover that
they all matched. Thus the A-D-E pattern (Section 2.5.2) to their modular invariants
was discovered!

The modular invariants for A_1^{(1)} realise the A-D-E pattern, in the following sense [91].
The (dual) Coxeter number h = h^∨ of the name X_ℓ equals k + 2, and the exponents m_i
of X_ℓ equal 1 plus those a ∈ P_+^k for which Z_{aa} ≠ 0 (for the algebras A_n, D_n, E_n, the
integers m_i are defined by writing the eigenvalues of the corresponding Cartan matrix
(Definition 1.4.5) as 4 sin²(π m_i/2h)). Probably what first led Kac to his observation about
the E_6 exponents was that k + 2 (this is how k enters most formulae), for his exceptional,
equals the Coxeter number 12 for E_6. More recently, deeper connections between A-D-E
and the A_1^{(1)} modular invariants have been found, notably in subfactor theory
(Section 6.2.6). This modular invariant classification, however, has never been directly
reduced to the suggestion of Section 2.5.2.
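
The exponent rule just described can be checked numerically for Kac’s exceptional at k = 10. The sketch below assumes the standard A_1^{(1)} level-k modular data, S_{ab} = √(2/(k+2)) sin(π(a+1)(b+1)/(k+2)) and T_{aa} = exp[2πi((a+1)²/(4(k+2)) − 1/8)], which we take to be what (6.2.2a) records; the function names are illustrative only.

```python
import cmath
import math


def a1_modular_data(k):
    """S matrix and diagonal of T for A_1^(1) at level k (assumed standard form)."""
    n = k + 2
    S = [[math.sqrt(2 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
          for b in range(k + 1)] for a in range(k + 1)]
    T = [cmath.exp(2j * math.pi * ((a + 1) ** 2 / (4 * n) - 1 / 8))
         for a in range(k + 1)]
    return S, T


def invariant_from_blocks(k, blocks):
    """Matrix Z with sum_ab Z_ab chi_a chi_b^* = sum over blocks |sum_{a in block} chi_a|^2."""
    Z = [[0] * (k + 1) for _ in range(k + 1)]
    for block in blocks:
        for a in block:
            for b in block:
                Z[a][b] += 1
    return Z


def invariance_defect(Z, S, T):
    """Largest absolute entry of ZS - SZ and ZT - TZ."""
    size = len(Z)
    worst = 0.0
    for a in range(size):
        for b in range(size):
            zs = sum(Z[a][c] * S[c][b] for c in range(size))
            sz = sum(S[a][c] * Z[c][b] for c in range(size))
            worst = max(worst, abs(zs - sz), abs(Z[a][b] * (T[b] - T[a])))
    return worst


k = 10
S, T = a1_modular_data(k)
E6 = invariant_from_blocks(k, [(0, 6), (3, 7), (4, 10)])
defect = invariance_defect(E6, S, T)
exponents = [a + 1 for a in range(k + 1) if E6[a][a] != 0]
print(defect < 1e-9, exponents, k + 2)
```

The list `exponents` collects a + 1 for the nonzero diagonal entries of Z, to be compared with the exponents of E_6, and k + 2 with its Coxeter number.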

The modular invariants have also been classified, for example, for A_2^{(1)} [232], and
they too seem quite interesting (Section 6.3.2). We are almost at the point where we
can safely conjecture the complete list of modular invariants for X_r^{(1)} at any k, for X_r
a simple algebra (see e.g. [236]). The most surprising thing about these affine algebra
modular invariant classifications is that there are so few surprises: almost every
modular invariant is ‘generic’, that is constructable using a few simple uniform methods
such as Coxeter-Dynkin diagram symmetries. Unfortunately, the classification for
semi-simple algebras X_{r_1} ⊕ ··· ⊕ X_{r_s} does not reduce to that for simple ones, and will be
hopeless.

Has A-D-E been discovered in the other modular invariant classifications? No, only
in those classifications trivially reducible to Theorem 6.2.2. There is, however, a rather
natural way to assign (multi-di)graphs to modular invariants, generalising the A-D-E
pattern for A_1^{(1)}. It is called a NIM-rep, and is a representation of the fusion ring by
nonnegative integer matrices. More precisely, for each weight a ∈ P_+^k(A_1^{(1)}) we want a
nonnegative integer matrix M_a such that

    M_a M_b = \sum_{c=0}^{k} N_{ab}^{c} M_c,    (6.2.6)

where N_{ab}^c are the fusion multiplicities of (6.2.2c). We also require M_0 = I, and all these
matrices to be symmetric: M_a = (M_a)^t. In Question 6.2.2 you are asked to find all such
assignments a ↦ M_a. Surprisingly, there is a near-perfect correspondence between
the A_1^{(1)} modular invariants, and these NIM-reps. Physically, NIM-reps are associated
with boundary conformal field theory or D-branes in string theory. See [47], [236] and
references therein for the basic theory and examples of NIM-reps. They are an integral
part of the combinatorial data of RCFTs. However, the simplicity of the correspondence
for A_1^{(1)} is an accident due to the small size of the relevant Perron–Frobenius eigenvalue
here. In particular there appear to be far more NIM-reps for A_2^{(1)} than modular invariants.
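
To make (6.2.6) concrete, here is a minimal sketch that builds candidate matrices M_a from the adjacency matrix of a graph and tests the NIM-rep conditions. It assumes the familiar A_1^{(1)} level-k fusion rule (N_{ab}^c = 1 when a + b + c is even and |a − b| ≤ c ≤ min(a + b, 2k − a − b), and 0 otherwise), which we take to be the content of (6.2.2c), and it uses the recursion M_{a+1} = M_1 M_a − M_{a−1} forced by fusion with the weight 1. The helper names are hypothetical.

```python
def fusion(k, a, b, c):
    """A_1^(1) level-k fusion multiplicity N_ab^c (assumed standard form)."""
    return int((a + b + c) % 2 == 0 and abs(a - b) <= c <= min(a + b, 2 * k - a - b))


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][m] * Y[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def nim_rep_from_graph(adjacency, k):
    """Candidate M_0, ..., M_k with M_0 = I, M_1 = adjacency, M_{a+1} = M_1 M_a - M_{a-1}."""
    n = len(adjacency)
    M = [[[int(i == j) for j in range(n)] for i in range(n)],
         [row[:] for row in adjacency]]
    for a in range(1, k):
        prod = matmul(adjacency, M[a])
        M.append([[prod[i][j] - M[a - 1][i][j] for j in range(n)] for i in range(n)])
    return M


def is_nim_rep(M, k):
    """Check nonnegativity, symmetry and the product rule (6.2.6)."""
    n = len(M[0])
    for a in range(k + 1):
        for i in range(n):
            for j in range(n):
                if M[a][i][j] < 0 or M[a][i][j] != M[a][j][i]:
                    return False
    for a in range(k + 1):
        for b in range(k + 1):
            lhs = matmul(M[a], M[b])
            rhs = [[sum(fusion(k, a, b, c) * M[c][i][j] for c in range(k + 1))
                    for j in range(n)] for i in range(n)]
            if lhs != rhs:
                return False
    return True


# D_4 diagram: node 0 is the centre, joined to nodes 1, 2, 3 (Coxeter number 6 = k + 2).
D4 = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
# A_5 diagram: a path on five nodes (also Coxeter number 6).
A5 = [[int(abs(i - j) == 1) for j in range(5)] for i in range(5)]

d4_at_level_4 = is_nim_rep(nim_rep_from_graph(D4, 4), 4)
a5_at_level_4 = is_nim_rep(nim_rep_from_graph(A5, 4), 4)
d4_at_level_3 = is_nim_rep(nim_rep_from_graph(D4, 3), 3)
print(d4_at_level_4, a5_at_level_4, d4_at_level_3)
```

The third call uses the D_4 graph at the wrong level, where the Coxeter number no longer equals k + 2, to illustrate how the product rule can fail.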

Hanany–He [271] suggest that the A_1^{(1)} A-D-E pattern can be related to subgroups
G ⊂ SU_2(C) by orbifolding four-dimensional N = 4 supersymmetric gauge theory by
G, resulting in an N = 2 superconformal field theory whose ‘matter matrix’ can be read
off from the Coxeter-Dynkin diagram corresponding to G. The same game can be played
with finite subgroups of SU_3(C), resulting in N = 1 superconformal field theories whose
matter matrices resemble the NIM-reps of A_2^{(1)}. [271] use this to conjecture optimistically
a McKay-type correspondence between singularities of type C^n/G, for G ⊂ SU_n(C),
and the modular invariants of A_{n−1}^{(1)}. This in their view would be the form A-D-E takes
for higher-rank modular invariants. Their conjecture is still too vague to be probed.

So far we have considered only integrable modules, which are necessarily at level
k ∈ N. But their modular behaviour can be mimicked at certain fractional levels, by the
so-called admissible modules [335]. It is tempting to guess that there should be natural
CFT and VOA interpretations for these, analogous to the integrable ones. The matrix S
there is symmetric, but has no column of constant phase and thus naively putting it into
Verlinde’s formula (6.1.1b) will necessarily produce some negative numbers (it appears
that they’ll always be integers though). A legitimate fusion ring has been obtained for
A_1^{(1)} at fractional level in other ways [26], [184], and initial steps for A_2^{(1)} have been
made in [221]. VOA interpretations for A_1^{(1)} admissible modules are given in [2], [148].
Serious doubt, however, on the relevance of these efforts has been cast by [225], [378].
Sorting this out is a high priority.

Related roles for other Kac—Moody algebras are slowly being found. The twisted affine 
algebras also have modular-like data, and arise naturally in the data for NIM-reps [58], 
[226]. Lorentzian Kac—Moody algebras have been proposed [171], [285] as the sym- 
metries of ‘M-theory’, the conjectural 11-dimensional theory underlying superstrings. 
Relations between strings and Borcherds—Kac—Moody algebras are discussed in [275], 
[276], [134]. 


6.2.2 Vertex operator algebras 


Let V be a ‘nice’ VOA (more on this shortly). The primaries a ∈ Φ label the finitely many
irreducible V-modules M^a. The relation between VOAs and SL_2(Z) given in (4.3.9) was
anticipated by RCFT, and proved by Zhu (Theorem 5.3.8). It gives (among other things)
the modular matrices S and T. Do they define modular data? If so, does Verlinde’s
formula (6.1.1b) compute the dimensions of intertwiner spaces (6.1.17)?


Definition 6.2.3 By a rational vertex operator algebra (RVOA) we mean a weakly rational
vertex operator algebra V (Definition 5.3.2) obeying in addition
(i) V is simple (that is, it is an irreducible module for itself) and the contragredient V* is
isomorphic to V as a V-module;
(ii) M_0 = {0} for all irreducible modules M ≇ V;
(iii) every N-graded weak module is completely reducible;
(iv) V is C_2-cofinite (Definition 5.3.5).


C_2-cofiniteness is a technical condition with many consequences. As we know, every
VOA is a module for itself; the contragredient of a module is discussed around (5.3.4a).
In any unitary RCFT, all conformal weights h_a, a ∈ Φ, are positive except for a = 0, so
condition (ii) is then automatic. Condition (iii) is a little stronger than the usual complete
reducibility requirement.

This use of the term ‘rational’ is not standard, and different definitions of ‘RVOA’ can
be found in the literature (some of these are listed in appendix A of [224]). But the term
‘rational VOA’ should be limited to those VOAs that possess some variant of modular
data. The justification for our use of the term is the following recent theorem:


Theorem 6.2.4 (Huang [297]) Let V be a VOA, rational in the sense of Definition 6.2.3.
Let Φ label its (finitely many) irreducible modules, let N_{ab}^c be the dimension
of the space \mathcal{V}\binom{c}{a\,b} of intertwiners, and let S be the matrix defined in Theorem 5.3.8,
satisfying (4.3.9a). Then Verlinde’s formula (6.1.1b) holds and S is symmetric. Also, the
category Rep V of V-modules has a natural structure as a modular category.


The objects of the category Rep V are V-modules, and the morphisms are V-module
homomorphisms. A modular category is described in Section 6.2.5 and is (among many
other things) a braided monoidal category. Theorem 6.2.4 is a corollary to Huang’s
programme of constructing geometric VOAs (Section 5.4.1) in genus ≤ 1 from an algebraic
VOA. It appears that additional minor conditions on the VOA V will be needed [296] in
order that the higher-genus chiral blocks be constructed; once identified, these restrictions
should be included in the definition of rationality for VOAs. Extending this work
to genus > 1 would be the final step in associating a modular functor (that is, a chiral
half of an RCFT, including all the Moore–Seiberg data) to a nice VOA.

Equation (6.1.1b) can be defined only if all S_{M0} ≠ 0, so Theorem 6.2.4 certainly
implies that. Some RVOAs (e.g. those associated with non-unitary RCFTs) won’t possess
modular data in the narrow sense of Definition 6.1.6. However, suppose in addition to
being rational that V has the (common) property that any irreducible module M ≇ V
has positive conformal weight h_M (recall h_M − c/24 is the smallest power of q in the
Fourier expansion of the graded dimension χ_M(τ) = q^{−c/24} Σ_n a_n^M q^{h_M+n}). This holds
for instance in all VOAs associated with unitary RCFTs. Then consider the behaviour of
χ_M(τ) for τ → 0 along the positive imaginary axis: since each Fourier coefficient a_n^M
is nonnegative, χ_M(τ) will go to +∞. But this is equivalent to considering the limit of
Σ_N S_{MN} χ_N(τ) as τ → i∞ along the positive imaginary axis. By hypothesis, this latter
limit is dominated by S_{M0} a_0^V q^{−c/24}, at least when S_{M0} ≠ 0. So what we find is that,
under this hypothesis, the 0-column of S consists of nonnegative real numbers (and also
that the central charge c is positive). But Verlinde’s formula certainly requires that all
numbers in the 0-column of S be nonzero. Thus we get:


Corollary 6.2.5 Suppose V is a rational VOA and for all irreducible modules M,
M_n = 0 for all n < 0. Then (4.3.9) (more precisely Theorem 5.3.8) define modular
data.


Of course the affine algebra modular data discussed in Section 6.2.1 is a special case of 
that considered here, corresponding to the integrable affine VOA V(g, k) constructed in 
Section 5.2.2. 

Verlinde’s formula (6.1.1b) is only a genus-0 special case of (6.1.2). What makes the
proof of Theorem 6.2.4 hard is the problem of constructing chiral blocks in genus
> 0. At the time of writing, only special cases have been worked out in arbitrary genus
(see, e.g., theorem 6.2 in [573]). Moore–Seiberg bypassed this difficulty by assuming
the chiral blocks all exist and have all the required properties.

As mentioned in Section 5.3.5, one direction in which Huang’s Theorem could possibly
be extended is to ‘quasi-rational’ CFT [436]. These are VOAs with infinitely many
irreducible modules, but with finite fusion products (5.3.3). They would correspond
to a ‘C_1-cofiniteness’ condition and typically have infinite-dimensional Zhu’s algebra.
The easiest example is the Heisenberg VOA (5.2.5), associated with the oscillator algebra
u_1 (3.2.12). We find directly from (3.2.12c) that the graded dimension of V(λ)
obeys


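    \chi_\lambda(\tau + 1) = e^{\pi i (\lambda^2 - 1/12)} \chi_\lambda(\tau),    (6.2.7a)

    \chi_\lambda(-1/\tau) = \int_{-\infty}^{\infty} e^{2\pi i \lambda\mu} \chi_\mu(\tau) \, d\mu.    (6.2.7b)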


In other words, on the Hilbert space L²(R) of square-integrable functions f(α), let S(f)
be the Fourier transform of f, and T(f) the function given by

    (T f)(\alpha) = e^{\pi i (\alpha^2 - 1/12)} f(\alpha).

Then S and T define a unitary representation of SL_2(Z) on the space L²(R) spanned by
the χ_λ (more precisely, they act on the space of functions χ_f(τ) = ∫_{−∞}^{∞} f(α) χ_α(τ) dα
for f ∈ L²(R)). In Verlinde’s formula (6.1.1b), the sum over Φ becomes an integral over
R, and yields the distribution

    N_{\lambda\mu}^{\nu} = \delta(\nu - \lambda - \mu),

in other words L(λ) ⊠ L(μ) = L(λ + μ), so the ‘fusion ring’ L²(R) is given a convolution
product.

It can be hoped that this modular behaviour would be typical for a wide class of 
other quasi-rational theories. The generalisation of Zhu’s Theorem 5.3.8 and Huang’s 
Theorem 6.2.4 to such quasi-rational theories would be wonderful to see. 

Modular invariants have a VOA interpretation. Let M^a and M'^i be the irreducible
modules of RVOAs V ⊂ V' sharing the same conformal vector ω. Then each M'^i is a
V-module. An RVOA is completely reducible, so each M'^i should be expressible as a
direct sum of M^a’s; these are called the branching rules. The sum Σ_i |χ_{M'^i}|² is
invariant under the SL_2(Z)-action; rewriting the χ_{M'^i}’s there in terms of the χ_{M^a}’s via
the branching rules yields a nontrivial modular invariant for V.

For instance, the VOA L(ω_0)' corresponding to the affine algebra G_2^{(1)} at level 1
contains the VOA L(28ω_0) = L(0) for A_1^{(1)} at level 28. We get the branching rules

    L(ω_0)' = L(0) ⊕ L(10) ⊕ L(18) ⊕ L(28),
    L(ω_2)' = L(6) ⊕ L(12) ⊕ L(16) ⊕ L(22).

Thus the Z' = I modular invariant for G_2^{(1)} level 1 yields the A_1^{(1)} modular invariant
ℰ_8 in Theorem 6.2.2.

So knowing the modular invariants for an RVOA V gives considerable information
concerning its possible ‘nice’ extensions V'. For instance, we are learning from this that
the only finite extensions of a generic integrable affine algebra VOA are those studied in
[147] (‘simple-current extensions’), and whose modular data is given in [212].


6.2.3 Quantum groups


The chiral data of affine algebras and Wess—Zumino—Witten models is also recovered 
by quantum groups (deformations of the universal enveloping algebra U (g)), though the 
reasons are still somewhat mysterious (i.e. indirect). 

Over the years large numbers of two-dimensional models in statistical mechanics were 
found that are exactly solvable (completely integrable). Gradually it became clear that 
the underlying reason was the so-called (quantum) Yang—Baxter equation [394]: 


    R^{12} R^{13} R^{23} = R^{23} R^{13} R^{12},    (6.2.8)


where R : V ⊗ V → V ⊗ V is linear and where, for example, R^{13} : V ⊗ V ⊗ V →
V ⊗ V ⊗ V sends v_1 ⊗ v_2 ⊗ v_3 ∈ V ⊗ V ⊗ V to Σ_i a_i ⊗ v_2 ⊗ b_i, where R(v_1 ⊗ v_3) =
Σ_i a_i ⊗ b_i. (Generalisations of (6.2.8) exist but this is enough for us.) The Yang–Baxter
equation should make us think of braids (recall Figure 1.29) and indeed an easy
result is:


Proposition 6.2.6 Given a solution R to (6.2.8), we obtain a representation of the
braid group B_n on V ⊗ ··· ⊗ V (n times) by sending the braid generator σ_i to (τR)^{i,i+1},
defined by (τR)^{i,i+1}(v_1 ⊗ ··· ⊗ v_n) = v_1 ⊗ ··· ⊗ v_{i−1} ⊗ (Σ_j b_j ⊗ a_j) ⊗ v_{i+2} ⊗ ··· ⊗ v_n,
where R(v_i ⊗ v_{i+1}) = Σ_j a_j ⊗ b_j.


The ‘transpose’ τ in Proposition 6.2.6 is the flip of the two copies of V; we see it again
in Definition 6.2.8. The reader should try to prove the proposition, but it’s also proved
in section 15.2A of [98].
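
A numerical experiment can make (6.2.8) and Proposition 6.2.6 less abstract. The sketch below uses one well-known 4 × 4 solution on V = C² (the R-matrix usually attached to the two-dimensional representation of U_q(A_1), normalised here as an assumption of ours and not taken from the text), checks the Yang–Baxter equation on V ⊗ V ⊗ V, and then checks the braid relation σ_1 σ_2 σ_1 = σ_2 σ_1 σ_2 for τR. Exact rational arithmetic avoids rounding issues; all names are illustrative.

```python
from fractions import Fraction
from functools import reduce


def kron(X, Y):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_x for y in row_y] for row_x in X for row_y in Y]


def matmul(X, Y):
    return [[sum(X[i][m] * Y[m][j] for m in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def product(*matrices):
    return reduce(matmul, matrices)


q = Fraction(2)          # sample nonzero value of the parameter (an arbitrary choice)
lam = q - 1 / q
I2 = [[1, 0], [0, 1]]
R = [[q, 0, 0, 0],       # basis order: e1(x)e1, e1(x)e2, e2(x)e1, e2(x)e2
     [0, 1, lam, 0],
     [0, 0, 1, 0],
     [0, 0, 0, q]]
P = [[1, 0, 0, 0],       # the flip tau on V (x) V
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]

R12 = kron(R, I2)
R23 = kron(I2, R)
P23 = kron(I2, P)
R13 = product(P23, R12, P23)   # conjugating by the flip of slots 2 and 3

ybe_holds = product(R12, R13, R23) == product(R23, R13, R12)

B = matmul(P, R)               # tau R acting on V (x) V
B1, B2 = kron(B, I2), kron(I2, B)
braid_holds = product(B1, B2, B1) == product(B2, B1, B2)
print(ybe_holds, braid_holds)
```

Replacing R by a generic matrix makes both checks fail, which is one way to appreciate how restrictive (6.2.8) is.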

We are interested in families R = R(q) of solutions to (6.2.8), depending on a complex
parameter q. Write q = e^ℏ. If we Taylor expand R(e^ℏ) = Σ_{n=0}^∞ ℏ^n r_n and retain only
the first-order terms in ℏ, we obtain the classical Yang–Baxter equation for r := r_1:

    [r^{12}, r^{13}] + [r^{12}, r^{23}] + [r^{13}, r^{23}] = 0.    (6.2.9)

Being a sum of commutators, it’s reminiscent of Lie algebras and indeed Lie theory
provides classes of solutions [98], [394]. Roughly, quantum groups were proposed by
Drinfel’d and Jimbo around 1985 as a Lie-like symmetry underlying (6.2.8), that is, as
providing a way to solve the quantum Yang–Baxter equation using q-deformations of
Lie theory.

The idea of deformations [279] is a beautiful one. For example, consider n-space R^n
and fix a vector q ∈ R^n (the ‘deformation parameter’). Define the new multiplication
by scalars to be k ·_q x := kx + (1 − k)q and vector addition to be x +_q y := x + y − q
(where the operations on the right sides are the usual R^n ones). The zero-vector here
is 0_q := q. This defines a new vector-space structure on the same underlying space.
However, it is of course isomorphic (as a vector space) to the original one, since the
dimension hasn’t changed.
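
The isomorphism can be written down explicitly: translation x ↦ x + q carries the usual operations to the deformed ones. The following minimal sketch checks this on sample vectors with exact rational arithmetic; the particular q, the sample vectors and the function names are arbitrary illustrative choices.

```python
from fractions import Fraction

Q = (Fraction(3), Fraction(-1), Fraction(1, 2))   # sample deformation parameter in Q^3


def add_q(x, y):
    """Deformed addition x +_q y = x + y - q."""
    return tuple(xi + yi - qi for xi, yi, qi in zip(x, y, Q))


def scale_q(k, x):
    """Deformed scalar multiplication k ._q x = k x + (1 - k) q."""
    return tuple(k * xi + (1 - k) * qi for xi, qi in zip(x, Q))


def translate(x):
    """Candidate isomorphism from the usual structure to the deformed one."""
    return tuple(xi + qi for xi, qi in zip(x, Q))


x = (Fraction(1), Fraction(2), Fraction(5))
y = (Fraction(-4), Fraction(0), Fraction(7, 3))
k = Fraction(5, 2)

usual_sum = tuple(xi + yi for xi, yi in zip(x, y))
usual_scaled = tuple(k * xi for xi in x)

additive = translate(usual_sum) == add_q(translate(x), translate(y))
homogeneous = translate(usual_scaled) == scale_q(k, translate(x))
zero_is_q = add_q(x, Q) == x
print(additive, homogeneous, zero_is_q)
```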

The finite-dimensional complex semi-simple Lie algebras g are also rigid in this sense
(see Question 6.2.3(b)). However, nontrivial deformations of their universal enveloping
algebras U(g) (Section 1.5.3) do exist.

Consider for concreteness g = A_1, with basis e, f, h of (1.4.2b). Define

    [e, f] = \frac{q^{h} - q^{-h}}{q - q^{-1}},    (6.2.10a)

    q^{h} e = q^{2} e q^{h},    (6.2.10b)

    q^{h} f = q^{-2} f q^{h}.    (6.2.10c)

Here by, for example, ‘q^h’ we mean the Taylor expansion in powers of h. These equations
define the quantum group U_q(A_1), a one-parameter deformation of U(A_1). Given this,
we get a solution R(q) to (6.2.8):
    R(q) = q^{(h \otimes h)/2} \sum_{n=0}^{\infty} q^{n(n+1)/2} \frac{(1 - q^{-2})^{n}}{[n]_q!} \, e^{n} \otimes f^{n},    (6.2.10d)

where [n]_q! = [n]_q [n − 1]_q ··· [1]_q for [k]_q = (q^k − q^{−k})/(q − q^{−1}). Nevertheless,
these equations look random and opaque (to this author at least). The next few paragraphs
aim to make some sense out of them.


Definition 6.2.7 Let k be a ring (take k = C if this generality makes you uncomfortable).
A Hopf algebra A is:
(i) An associative algebra over k with unit 1 and multiplication μ.
(ii) A co-associative co-algebra over k, i.e. with co-multiplication Δ : A → A ⊗ A
and co-unit ε : A → k.
(iii) The algebra and co-algebra structures are compatible, i.e. Δ and ε are algebra
homomorphisms, and μ and 1 (regarded as a map ι : k → A sending x ↦ x1)
are co-algebra homomorphisms.
(iv) A has a map S : A → A, called the antipode, which obeys

    μ ∘ (id ⊗ S) ∘ Δ = ι ∘ ε = μ ∘ (S ⊗ id) ∘ Δ.


We’ve seen ‘algebra’ before. A Hopf algebra may or may not be commutative as an
algebra. A ‘co-algebra’ is an ‘algebra with the arrows reversed’: just as an algebra has a
bilinear map A ⊗ A → A (multiplication), so a co-algebra has a linear map A → A ⊗ A
(co-multiplication), and similarly for unit and co-unit.

Perhaps [51] or the introduction to [398] can help make this definition seem more
natural. Hopf algebras are algebras with a rich representation theory. If M, N are modules
of a generic algebra A, then their usual vector-space tensor product M ⊗ N always has a
natural structure as an A ⊗ A-module, but generally not an A-module. But if A has a
co-product, we get the A-module structure by the formula a.(m ⊗ n) := Δ(a).(m ⊗ n). The
antipode converts left modules into right modules, and is used to define the representation
M* dual to a given representation M. It plays the role of inverse in the algebra. See also
Question 6.2.4.

For example, a universal enveloping algebra U(g) forms a Hopf algebra with co-product
given by Δ(x) = x ⊗ 1 + 1 ⊗ x for x ∈ g and Δ(1) = 1 ⊗ 1; co-unit ε(x) = 0
for x ∈ g and ε(1) = 1; and antipode S(x) = −x for x ∈ g and S(1) = 1. In a similar
way, the space F(G) of functions on a Lie group G is also a Hopf algebra (in fact a dual
of U(g)). U(g) is co-commutative, whereas F(G) is commutative; in fact, these U(g)
are the only co-commutative, and F(G) the only commutative, Hopf algebras (modulo
certain technical assumptions). This is in fact why Drinfel’d [160] cooked up the name
‘quantum group’ for these q-deformations. U_q(g) is a non-co-commutative deformation
of U(g), so we could imagine that just as the dual of U(g) consists of the functions on
a group G, the dual of U_q(g), which will be a non-commutative Hopf algebra, should
correspond to something like the functions on a group-like object G_q, which would be
some sort of q-deformed version of G. This picture is in the same spirit as Connes’
non-commutative geometry. In any case the term ‘quantum group’ has inappropriately
slipped from G_q to apply directly to U_q(g).

The co-product, etc. for these U_q(g) are explicitly given in proposition 6.5.1 of [98]
in full generality. Although U_q(g) is not co-commutative, it is nearly so:


Definition 6.2.8 A quasi-triangularisable Hopf algebra A is a Hopf algebra with
invertible element R ∈ A ⊗ A such that τ(Δ(a)) = R Δ(a) R^{−1} for all a ∈ A, as well
as

    (\Delta \otimes \mathrm{id})(R) = R^{13} R^{23} \in A \otimes A \otimes A,
    (\mathrm{id} \otimes \Delta)(R) = R^{13} R^{12} \in A \otimes A \otimes A.

This element R is called the universal R-matrix (or braiding) of A. Of course if A is
co-commutative, then R = 1 ⊗ 1 works. The point: the element R satisfies the quantum
Yang–Baxter equation (6.2.8). This is the origin of the word ‘triangular’ in Definition 6.2.8:
an alternate name for the Yang–Baxter equation is the star–triangle relation. So
given any representation of A, R maps to a matrix satisfying (6.2.8); this
representation-independent aspect of R justifies the word ‘universal’. Any non-co-commutative
quasi-triangularisable Hopf algebra is now called a quantum group.

Drinfel’d [160] found a remarkable way, independent of the Yang–Baxter equation, to
construct quantum groups from any Hopf algebra A. The quasi-triangular Hopf structure
is put on the space A ⊗ (A*)^{op} (the quantum double of A), where (A*)^{op} is the dual
Hopf algebra A* except that its co-multiplication is changed from Δ* to its transpose
τ ∘ Δ*. A nice discussion is in [480]; a general categorical interpretation is the ‘centre
construction’ [338]. In particular, the quantum group U_q(g) of (6.2.10) arises as a simple
quotient of the quantum double of U_q(b_+), where b_+ is the Borel subalgebra of g,
generated by h_i and e_i. See section 4.6 of [207], where this is discussed very explicitly.
The point is that U_q(b_+) is very easy to understand, so this gives an explicit way to
compute R for U_q(g).

As usual we’re interested in representation theory. Recall that the modules of A_1
and U(A_1) are identical. There is only one one-dimensional A_1-module: everything
gets sent to 0. However, there are exactly two one-dimensional representations of the
quantum group U_q(A_1): e.v = f.v = 0 and q^h.v = ±v. Call these ψ_±. ψ_+ is just the
deformation of the trivial U(A_1)-representation, but ψ_− has no classical (i.e. q → 1)
analogue. The existence of ψ_− is the only difference between the representation theory
of U_q(A_1) and U(A_1) (or A_1): every finite-dimensional irreducible U_q(A_1)-module is
uniquely expressible as the tensor product of a one-dimensional representation ψ_± with
some highest-weight representation L_q(m), for m ∈ N, where L_q(m) is a deformation
of L(m) with the same Weyl character. This generalises to any U_q(g).

We’re more interested in U_q(g) ‘at a root of unity’. The meaning of this is very subtle,
but is explained very thoroughly in chapter 9 of [98] (we are interested in their second
construction, the ‘restricted integral form’ U_q^{res}(g), which is a quotient of U_q(g)); see
also [10], [392]. The representation theory is also subtle, and most treatments (e.g. that
of [98]) assume from the start that the order of the root of unity must be odd. See, for
example, [392], [10] for their modules. There are now indecomposable modules that are
not irreducible, a common situation in algebra (recall Question 1.1.6). The trick of how to
proceed was discovered by physicists: throw the sick modules away! In particular, when
we evaluate the Weyl characters at the root of unity q, the result is called the quantum
dimension of the module. We keep those modules with nonzero quantum dimension, and
discard the others. This prescription works because the tensor product of any
U_q^{res}(g)-module with any sick one is a direct sum of sick ones. We can call this ‘the reduced
representation ring of the quantum group U_q(g) specialised to the root of unity q’. See
section 4.5 of [207] for examples (though note that his q is the square of ours).

The result is somewhat surprising: this reduced representation ring, for q = e^{πi/(m(k+h^∨))}
(where m is defined below), is isomorphic to the fusion ring of g^{(1)} at level k
[190]. Here, m = 1 for g = A_r, D_r, E_6, E_7, E_8; m = 2 for g = B_r, C_r, F_4; and m = 3
for g = G_2.
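
For g = A_1 (so that the root of unity is q = e^{πi/(k+2)}) the quantum dimensions are easy to tabulate. The sketch below assumes that q^h acts on the weight vectors of L_q(n) by q^n, q^{n−2}, ..., q^{−n}, so that evaluating the Weyl character gives the quantum integer [n + 1]_q; it then compares these values with the ratios sin(π(n+1)/(k+2))/sin(π/(k+2)) familiar from the A_1^{(1)} S matrix. The highest weight is called n in the code to avoid a clash with the integer m above; the function names are ours.

```python
import cmath
import math


def quantum_integer(n, q):
    """[n]_q = (q^n - q^-n) / (q - q^-1)."""
    return (q ** n - q ** (-n)) / (q - 1 / q)


def quantum_dimension(highest_weight, k):
    """Quantum dimension of L_q(highest_weight) for A_1 at q = exp(pi i / (k + 2))."""
    q = cmath.exp(1j * math.pi / (k + 2))
    return quantum_integer(highest_weight + 1, q).real


k = 3
dims = [quantum_dimension(n, k) for n in range(k + 2)]
sine_ratios = [math.sin(math.pi * (n + 1) / (k + 2)) / math.sin(math.pi / (k + 2))
               for n in range(k + 1)]
for n, d in enumerate(dims):
    print(n, round(d, 6))
```

In this sketch the weights 0, ..., k have positive quantum dimension, while the weight k + 1 has quantum dimension zero, which is the first place where the prescription above discards something.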

More generally, much of the chiral data of the Wess—Zumino—Witten theories are 
recovered by the corresponding quantum group at a root of unity [253], [207]: along 
with the fusion multiplicities, also the braiding and fusing matrices of Section 6.1.4, and 
the associated knot invariants of Section 6.2.5. Explanations for these ‘coincidences’ are 
given in, for example, chapter 11 of [253], but they are all unsatisfying in that they are 
so indirect. 


6.2.4 Twisted #6: finite group modular data 
In many respects, a finite group G behaves much like a compact connected Lie group, and 
so we may hope that they possess an analogue of Section 6.2.1. Indeed that is beautifully 
the case. 


For any finite group G (Section 1.1), let K_1, ..., K_h be its conjugacy classes, and
write k_i for Σ_{g∈K_i} g ∈ CG. These k_i’s form a basis for the centre of CG. Write

    k_i k_j = \sum_{\ell} c_{ij}^{\ell} k_\ell;    (6.2.11a)

then the structure constants c_{ij}^ℓ are nonnegative integers, and we obtain

    c_{ij}^{\ell} = \frac{\|K_i\| \, \|K_j\|}{\|G\|} \sum_{ch \in \mathrm{Irr}\, G} \frac{ch(g_i) \, ch(g_j) \, \overline{ch(g_\ell)}}{ch(e)},    (6.2.11b)

where g_i ∈ K_i. This resembles (6.1.1b), with S_{ab} replaced by S_{i,ch} = ch(g_i) and the
vacuum 0 by the identity e. Unfortunately, the other axioms of modular data fail.

However, the group algebra CG is a Hopf algebra, with co-multiplication Δ(g) =
g ⊗ g, co-unit ε(g) = 1 and antipode S(g) = g^{−1}. The way to obtain true modular data
is to take the quantum double of CG. Its Hopf dual, the space F[G] of functions G →
C, is also a Hopf algebra, for example, with co-product Δ(f)(g_1, g_2) = f(g_1 g_2). The
construction of the double D(G) is described nicely in [406]; we will simply describe
its modular data.

Let Φ be the set of all pairs (a, ch), where the a are representatives of the conjugacy
classes of G and ch is the character of an irreducible representation of the centraliser
C_G(a). (Recall that C_G(a) is the set of all g ∈ G commuting with a.) Φ parametrises the
irreducible modules of the double D(G). Put [393], [136]

    S_{(a,ch),(a',ch')} = \frac{1}{\|C_G(a)\| \, \|C_G(a')\|} \sum_{g \in G(a,a')} \overline{ch'(g^{-1} a g)} \; \overline{ch(g a' g^{-1})},    (6.2.12a)

    T_{(a,ch),(a',ch')} = \delta_{a,a'} \, \delta_{ch,ch'} \, \frac{ch(a)}{ch(e)},    (6.2.12b)

where G(a, a') = {g ∈ G | a g a' g^{−1} = g a' g^{−1} a} and e ∈ G is the identity. For the
‘vacuum’ 0 take (e, 1). Then (6.2.12) is modular data. Manifestly N-valued descriptions
of the fusion multiplicity N_{(a,ch),(a',ch')}^{(a'',ch'')} exist (see section 2 of [391], who realises
the fusion ring as the Grothendieck ring for G-equivariant vector bundles). For Lusztig,
(6.2.12) arose in his determination of irreducible characters of Chevalley groups. The
higher-genus fusion multiplicities in (6.1.2) also have interpretations as multiplicities of
representations of D(G) in D(G) ⊗ ··· ⊗ D(G) [35].

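For instance, the modular data associated with the finite group S_3 is

    S = \frac{1}{6} \begin{pmatrix}
        1 &  1 &  2 &  2 &  2 &  2 &  3 &  3 \\
        1 &  1 &  2 &  2 &  2 &  2 & -3 & -3 \\
        2 &  2 &  4 & -2 & -2 & -2 &  0 &  0 \\
        2 &  2 & -2 &  4 & -2 & -2 &  0 &  0 \\
        2 &  2 & -2 & -2 & -2 &  4 &  0 &  0 \\
        2 &  2 & -2 & -2 &  4 & -2 &  0 &  0 \\
        3 & -3 &  0 &  0 &  0 &  0 &  3 & -3 \\
        3 & -3 &  0 &  0 &  0 &  0 & -3 &  3
    \end{pmatrix},    (6.2.13a)

    T = \mathrm{diag}(1, 1, 1, 1, e^{2\pi i/3}, e^{-2\pi i/3}, 1, -1).    (6.2.13b)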


See [115] for several more explicit examples. 
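
The S_3 example can be recomputed directly from (6.2.12). In the sketch below permutations of {0, 1, 2} are stored as tuples, each character is stored as a dictionary on the relevant centraliser, and the complex conjugates in (6.2.12a) are taken as in the usual form of that formula (for S_3 the resulting S is real, so this particular check is insensitive to that convention). The ordering of the eight primaries, the choice of class representatives and all names are our own assumptions.

```python
# Modular data of the quantum double D(S_3), computed directly from (6.2.12).
import cmath
from itertools import permutations

G = list(permutations(range(3)))   # p[i] is the image of i
E = (0, 1, 2)                      # identity
A = (1, 2, 0)                      # a 3-cycle
B = (1, 0, 2)                      # a transposition


def mul(p, r):
    """Composition: (p r)(i) = p(r(i))."""
    return tuple(p[r[i]] for i in range(3))


def inv(p):
    out = [0] * 3
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)


def sign(p):
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def fixed_points(p):
    return sum(1 for i in range(3) if p[i] == i)


w = cmath.exp(2j * cmath.pi / 3)
A2 = mul(A, A)

# Primaries (a, ch); ch is a dict defined on the centraliser C_G(a).
PRIMARIES = [
    (E, {g: 1 for g in G}),                     # trivial character of S_3
    (E, {g: sign(g) for g in G}),               # sign character
    (E, {g: fixed_points(g) - 1 for g in G}),   # two-dimensional character
    (A, {E: 1, A: 1, A2: 1}),
    (A, {E: 1, A: w, A2: w ** 2}),
    (A, {E: 1, A: w ** 2, A2: w ** 4}),
    (B, {E: 1, B: 1}),
    (B, {E: 1, B: -1}),
]


def s_entry(x, y):
    (a, ch), (a2, ch2) = x, y
    total = 0
    for g in G:
        moved = mul(mul(g, a2), inv(g))          # g a' g^-1
        if mul(a, moved) != mul(moved, a):       # keep only g in G(a, a')
            continue
        back = mul(mul(inv(g), a), g)            # g^-1 a g
        total += complex(ch2[back]).conjugate() * complex(ch[moved]).conjugate()
    return total / (len(ch) * len(ch2))          # len(ch) is the order of C_G(a)


S = [[s_entry(x, y) for y in PRIMARIES] for x in PRIMARIES]
T = [ch[a] / ch[E] for a, ch in PRIMARIES]
six_S = [[round((6 * entry).real) for entry in row] for row in S]
for row in six_S:
    print(row)
```

The printed rows are 6S, to be compared entry by entry with (6.2.13a), and the list T with the diagonal in (6.2.13b).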

This modular data can be twisted [138], [135], [34], [115] by a 3-cocycle α ∈
H^3(G, C^×). Indeed this twisted modular data is absolutely as fundamental as (6.2.12);
recall the discussion in Sections 4.3.4 and 5.3.6. This cocycle α plays the same role here
that level does in affine algebra modular data, as H^3(G, C^×) = Z when G is
simply-connected and simple. This sort of twist has a generalisation to arbitrary chiral data
[118].

One of the remarkable features of affine algebra modular data (its ubiquity) is
shared by finite group modular data. Most important for us, it arises in the orbifold of
holomorphic VOAs (recall Section 5.3.6). Let G be a finite group of automorphisms
of a holomorphic VOA V; all finite groups arise in this way (Question 6.2.7). Let
V^G be the space of fixed points of G; it inherits a VOA structure from V. Then
the modular data of V is trivial but that of V^G is expected to be (6.2.12) or some
twisted version (see Conjecture 5.3.10). This modular data also appears in the
crossed-product construction in von Neumann algebras (Section 6.2.6). In physics, it arises in
(2 + 1)-dimensional Chern–Simons theory with finite gauge group G [138], [194], as
well as (2 + 1)-dimensional quantum field theories where a continuous gauge group has
been spontaneously broken to a finite group [31] (adding a Chern–Simons term here
corresponds to the cohomological twist).

Fig. 6.7 Colourings at a crossing.

This modular data is quite interesting for nonabelian G, and deserves more study. It
seems very effective at distinguishing groups; in fact, it is known to distinguish all groups
of order < 128. Conversely, there are non-isomorphic groups of order 2^{15} · 3^4 · 5 · 7
with identical modular data up to reordering primaries [175]. Finite group modular data
behaves very differently from the affine algebra data (see e.g. [115], [457], [178]). For
instance, Eiichi Bannai has found that the alternating group A_5, which has only 22
primaries, has a remarkably high number (8719) of modular invariants. By contrast,
affine algebras have relatively few modular invariants.


6.2.5 Knots 


The Jordan curve theorem states that all knots in R² are trivial. Are there any nontrivial
knots in R³?

In Figures 1.9 and 1.10 are some knots in R³, flattened into the plane of the paper. A
moment’s consideration will confirm that the second knot of Figure 1.9 is indeed trivial.
What about the trefoil?

A knot diagram cuts the knotted S^1 into several connected components (arcs), whose
endpoints lie at the various crossings (double-points of the projection). By a 3-colouring,
we mean to colour each arc in the knot diagram either red, blue or green, so that at each
crossing either one or three distinct colours are used. For example, the first two colourings
in Figure 6.7 are allowed, but the third isn’t. By considering the ‘Reidemeister moves’
(Figure 1.12), which tell us how to move between equivalent knot diagrams, different
diagrams for equivalent knots (such as the two in Figure 1.9) are seen to have equal
numbers of distinct 3-colourings. Hence, the number of 3-colourings is a knot invariant.

For example, consider the diagrams in Figure 1.9 for the trivial knot: clearly, all arcs
must be given the same colour, and thus there are precisely three distinct 3-colourings.
On the other hand, the trefoil has nine distinct 3-colourings: the bottom two arcs of
Figure 1.10 can be assigned arbitrary colour, and that choice fixes the colour of the top
arc. Thus the trefoil is nontrivial!

Fig. 6.8 The Wirtinger presentation of the knot group.
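
The count of 3-colourings is easy to automate by brute force. The sketch below encodes a diagram as a list of crossings, each given by the indices of the three arcs meeting there, and counts the colourings in which every crossing sees either one colour or three. The encoding of the trefoil (three arcs, with every crossing involving all three) and of the unknot diagrams is our own, chosen to match the standard pictures rather than read off from the figures.

```python
from itertools import product


def count_3_colourings(num_arcs, crossings):
    """Count colourings of the arcs by 3 colours with 1 or 3 colours at each crossing.

    crossings is a list of triples of arc indices (the over-arc and the two under-arcs).
    """
    count = 0
    for colours in product(range(3), repeat=num_arcs):
        if all(len({colours[arc] for arc in crossing}) in (1, 3) for crossing in crossings):
            count += 1
    return count


# Standard trefoil diagram: 3 arcs, 3 crossings, each crossing meets all three arcs.
TREFOIL = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]

# Round unknot: one arc, no crossings. Kinked unknot: one arc crossing itself once.
unknot = count_3_colourings(1, [])
kinked_unknot = count_3_colourings(1, [(0, 0, 0)])
trefoil = count_3_colourings(3, TREFOIL)
print(unknot, kinked_unknot, trefoil)
```

The two unknot diagrams give the same count, as invariance under the first Reidemeister move requires, and the trefoil gives a different one.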

Essentially what we are doing here is counting the number of homomorphisms φ from
the knot group π_1(R³ \ K) of knot K to the symmetric group S_3. The reason is that any
(oriented) knot diagram gives a presentation for π_1(R³ \ K), where there is a generator
x_i for each arc and a relation of the form x_i x_j = x_k x_i for each crossing (Figure 6.8). See
section 3.D of [478] for more details and a proof. For example, the knot group of the
right knot of Figure 1.9 has presentation


⟨x_1, ..., x_7 | x_1x_2 = x_4x_1, x_5x_1 = x_3x_5, x_5x_4 = x_3x_5, x_2x_1 = x_5x_2, 
x_2x_7 = x_2x_2, x_2x_7 = x_6x_2, x_2x_5 = x_6x_2⟩, 


which is isomorphic to Z. By contrast, the knot group of the trefoil is the braid group B_3 (Question 6.2.8). 
Incidentally, the complement R^3 \ K of a knot determines the knot, and the extent to 
which the knot group determines the knot is also understood (see section 1 of [61]). 
Therefore, in this sense the trefoil and B_3 are intimately connected (recall Section 2.4.3). 

S_3 is generated by the transpositions (12), (23), (13). The homomorphism φ : π_1(R^3 \ 
K) → S_3 is defined using, for example, the identification r ↔ (12), b ↔ (23), g ↔ (13), 
and the above 3-colouring condition at each crossing is equivalent to requiring that φ 
obeys each relation in the Wirtinger presentation. Our homomorphism φ will be onto 
iff at least two different colours are used. By considering more general (non-abelian) 
colourings, the target (S3 here) can be made to be any other group G, resulting in a 
different knot invariant. 
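
The count of 3-colourings can be checked by brute force. The following is a minimal sketch, not taken from the text: it assigns one of the three transpositions of S_3 to each arc and keeps the assignments obeying every Wirtinger relation x_i x_j = x_k x_i. The names `count_colourings`, `TREFOIL` and `UNKNOT` are illustrative; the trefoil triples are an assumed standard three-crossing diagram, and the unknot triples are one reading of the seven-relation presentation quoted earlier.

```python
from itertools import product

# The three transpositions of S3, as tuples p with p[i] = image of i.
TRANSPOSITIONS = [(1, 0, 2), (0, 2, 1), (2, 1, 0)]  # (12), (23), (13)


def mul(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))


def count_colourings(n_arcs, relations):
    """Count assignments of transpositions to arcs x_1..x_n satisfying
    every relation x_i x_j = x_k x_i, given as triples (i, j, k)."""
    count = 0
    for xs in product(TRANSPOSITIONS, repeat=n_arcs):
        if all(mul(xs[i - 1], xs[j - 1]) == mul(xs[k - 1], xs[i - 1])
               for i, j, k in relations):
            count += 1
    return count


# Assumed trefoil diagram: three arcs, each the over-arc at one crossing.
TREFOIL = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]
# Assumed reading of the seven-arc diagram of the trivial knot.
UNKNOT = [(1, 2, 4), (5, 1, 3), (5, 4, 3), (2, 1, 5),
          (2, 7, 2), (2, 7, 6), (2, 5, 6)]

print(count_colourings(3, TREFOIL))  # 9
print(count_colourings(7, UNKNOT))   # 3
```

Replacing `TRANSPOSITIONS` by a conjugacy class of some other finite group would give the more general colouring invariants mentioned above.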

In the early 1980s, knot theory was dormant; by the late 1980s it was flourishing. But 
as a consequence, we suddenly had too many knot invariants. Reshetikhin and Turaev 
[473] brought order to this chaos, by proving that whenever we have a ribbon category V, 
we get invariants of (framed) knots and links, that is of knotted and linked ribbons. The 
reason for their result, as we explain in Section 1.6.2, is the universality of the topological 
category Ribbon of ribbons (Theorem 1.6.2). Given any knotted link, coloured with the 
objects of V, their functor associates the link with some morphism Hom(@, Ø) of V, 
and isotopic links get assigned the same morphism. This morphism is the desired link 
invariant. For example, the 3-colouring invariant comes from a ribbon category associated 
with the modular data (6.2.13). 

Fig. 6.9 The S_ab and T_aa matrix entries in modular categories. 

We can express their result slightly differently. Suppose we have a representation 
of every braid group B_n (e.g. Proposition 6.2.6 says we get this from a solution to 
the quantum Yang–Baxter equation). To every braid we get a link by closing it up, as in 
Figure 1.14. Unfortunately, different braids can get assigned the same link. As we explain 
in Section 1.2.3, the two Markov moves capture precisely this redundancy. Thus we get 
a link invariant from our braid representations if we can construct a quantity invariant 
with respect to these two moves. The first move β' ↔ ββ'β^{-1} suggests we assign to the 
braid the trace of its representing matrix; unfortunately, that usually won’t respect the 
second move, β ↔ βσ_n^{±1}. 

However, [473] explain how to enhance the braid representation coming from 
any quasi-triangularisable Hopf algebra (Definition 6.2.8), to get link invariants. See 
section XI.3.1 of [534] for details. Thus, combining their construction with the Drinfel’d 
double, which associates a quasi-triangularisable Hopf algebra with any Hopf algebra, 
we can construct (or recover) enormous numbers of link invariants. 

So far we have discussed invariants of links embedded in R^3 (equivalently, S^3). Much 
more difficult is to construct invariants of links in arbitrary 3-manifolds, but it is precisely 
this that is relevant to our story. There are (at least) two ways to do this: one uses ‘Dehn 
surgery’ to construct the manifold from S^3 [474], and the other uses triangulation by 
tetrahedra [535]. We allude to the Turaev—Viro theory [535] elsewhere. In the early 1960s 
Lickorish and Wallace established that any closed compact oriented 3-manifold M can 
be obtained by surgery on the 3-sphere S^3 along some framed link L (see section II.2.1 
of [534] for details). The idea is to construct an invariant for M from the link invariant of 
L in S^3. For instance, the 3-manifold S^1 × S^2 arises from S^3 by surgery along the trivial 
ribbon. The problem is that different links give rise to the same manifold. However, this 
redundancy is completely captured by the so-called ‘Kirby moves’ (see section II.3.1 of 
[534] for details). Once again, Reshetikhin and Turaev [474] find the necessary refinement 
to ribbon categories, as well as the precise expression for the 3-manifold invariant, 
which will make the quantity invariant under the Kirby moves. The result is called a 
modular category (see chapter 2 of [534] for complete details). Roughly speaking, it is 
a ribbon category with the additional property of direct sum, with a finite set of ‘simple 
objects’ (closed under x) and a complete reducibility property, whose Hopf link invariant 
(Figure 6.9) is nondegenerate. More generally, this procedure gives us link invariants in 
any 3-manifold. Again, the ultimate source of these topological invariants is a universality 
property of the appropriate topological category. All of these universalities have as their 
source the universality of Braid for braided monoidal categories (Theorem 1.6.1). 

Any RCFT gives a modular category (in fact two of them, one for each chiral half). For 
an RCFT, the simple objects are the chiral primaries, the monoidal 
structure is the fusion product and duality is charge-conjugation. Modular data is obtained 
directly from the Hopf link and twist, as in Figure 6.9. There are thus three different 
incarnations of the S-matrix in RCFT: the modular transformation (4.3.9a), Verlinde’s 
formula (6.1.2), and the Hopf link. In fact, the notion of a modular category is equivalent 
to that of Segal’s modular functor (Section 4.4.1) [534], [32]. For a sufficiently nice VOA 
V, the simple objects are the irreducible V-modules. The 3-colouring invariant of Figure 
6.7 comes from a holomorphic orbifold VOA, and as such can be modified to yield a 
link invariant in any 3-manifold. 

For instance, we get S^3 knot invariants from the quantum group U_q(X_r) with generic 
parameter, but to get invariants for any closed 3-manifold requires specialising q to a 
root of unity. Modular categories are far less common than ribbon categories, but they 
can be obtained by an analogue of the Drinfel’d double. 


6.2.6 Subfactors 


The final general source of modular data that we discuss is from subfactor theory. The 
relation of subfactors to knots is reviewed in, for example, [317], [318], [319], while 
reviews of the relation between subfactors and CFT can be found in [177], [66]. 

Recall the definitions in Section 1.3.2. Let N ⊂ M be an inclusion of type II_1 factors. 
We call N a subfactor, provided N includes the identity of M. Jones’ motivation for 
looking at subfactors came from their formal similarity with Galois theory. After all, the 
very notation dim_M(H) for the ‘coupling constant’ of Section 1.3.2 suggests thinking of 
a type II_1 factor as a non-commutative analogue of ‘field of scalars’. 

In particular, let G be a finite group acting on some type II_1 factor N. Then the 
crossed-product N ⋊ G is also a type II_1 factor, iff each g ∈ G, g ≠ e, is ‘outer’. By an 
outer automorphism g of N we mean that there are no unitary operators u ∈ N such that 
g.x = uxu* for all x ∈ N. Any locally compact (e.g. finite) group G acts on, for example, 
the hyperfinite type II_1 factor by outer automorphisms, so this isn’t a major restriction. 
This yields a Galois correspondence between subgroups H of G, and subalgebras of M 
containing the algebra M^G of fixed points, given by H ↔ M^H. This is analogous to the 
relation between subfields K ⊂ L and Galois groups in Section 1.7.2. So what is the 
subfactor analogue of the index [L : K]? 

Jones’ answer is the Jones index of the subfactor N C M: 


[M : N] := dim_N(L^2(M)) ≥ 1,    (6.2.14) 


where L^2(M) is the Hilbert space of Question 1.3.6. For instance, for any n ≥ 1, [N ⊗ 
M_n(C) : N] = n^2. If H < G are finite groups of outer automorphisms, then [M ⋊ G : 
M ⋊ H] = ||G||/||H|| = [M^H : M^G], where the crossed-product M ⋊ H and fixed-point 
M^H factors are discussed in Section 1.3.2. 

The following theorem was completely unexpected. 


Theorem 6.2.9 [316] For any number 
d ∈ {4cos^2(π/n)}_{n=3}^∞ ∪ [4, ∞], 
there is a subfactor N ⊂ M of the unique hyperfinite type II_1 factor M, with index 
[M : N] = d. Conversely, the index of any subfactor of a (not necessarily hyperfinite) 
type II_1 factor will be in that set. 
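
To get a feel for the discrete part of this set, the first few values can be tabulated numerically. This is only an illustrative sketch: the helper names `jones_index` and `is_allowed_index`, the cutoff `n_max` and the tolerance are assumptions made for the example, and the membership test is a floating-point approximation rather than a proof.

```python
import math


def jones_index(n):
    """Return 4*cos^2(pi/n), the n-th value of the discrete series (n >= 3)."""
    if n < 3:
        raise ValueError("the discrete series starts at n = 3")
    return 4 * math.cos(math.pi / n) ** 2


def is_allowed_index(d, n_max=10_000, tol=1e-9):
    """Rough numerical membership test for the set in Theorem 6.2.9."""
    if d >= 4 - tol:
        return True
    return any(abs(d - jones_index(n)) < tol for n in range(3, n_max))


for n in range(3, 9):
    print(n, round(jones_index(n), 6))
```

The series begins 1, 2, (3 + √5)/2, 3, ... and accumulates at 4, so a value such as 2.5 is excluded while every d ≥ 4 is allowed.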


In fact, the following rigidity is true: if M is the hyperfinite type II_1 factor, then at 
most four inequivalent subfactors N C M can possess the same index < 4. The reader, 
with Section 2.5.2 fresh in mind, may recognise the discrete sequence of indices in 
Theorem 6.2.9 as the square of the Perron—Frobenius eigenvalues of the A-D—E graphs — 
is this a coincidence? 

The key to proving Theorem 6.2.9, as well as the further developments, is the so-called 
basic construction, which appears to have been found independently by a number 
of people in the late 1970s. Let N ⊂ M be an inclusion of type II_1 factors. Even though 
M and N are isomorphic as factors, there is rich combinatorics surrounding how N is 
embedded in M. The Hilbert space L^2(N) is naturally contained in L^2(M). Let e_N be the 
orthogonal projection of L^2(M) onto L^2(N). Then M and e_N generate the von Neumann 
algebra ⟨M, e_N⟩'' acting on the space L^2(M). If the index [M : N] is finite, then ⟨M, e_N⟩'' 
will also be a type II_1 factor, with index [⟨M, e_N⟩'' : M] = [M : N]. Moreover, since the 
trace (normalised so that tr(1) = 1) on a type II_1 factor is unique, we can unambiguously 
speak of the trace tr(e_N), and we find it equals 1/[M : N]. For later convenience define 
τ := 1/[M : N]. 

For example, taking N to be the fixed points M^G, for some finite group G of outer 
automorphisms, then e_N = (1/||G||) Σ_{g∈G} g, tr(e_N) = 1/||G|| and ⟨M, e_N⟩'' = M ⋊ G. This 
demonstrates the naturalness of this construction: the von Neumann algebra 
generated by M and e_N is precisely the crossed-product M ⋊ G. 

We can repeat the basic construction indefinitely. Put M_0 := N, M_1 := M and define 
inductively 


M_{i+1} := ⟨M_i, e_{M_{i-1}}⟩'', 


where e_i := e_{M_{i-1}} is the orthogonal projection from L^2(M_i) onto L^2(M_{i-1}). We thus get 
a tower M_0 ⊂ M_1 ⊂ ··· of type II_1 factors, and a sequence e_1, e_2, ... of projections. 
The limit M_∞ := ∪_{n=0}^∞ M_n is also a type II_1 factor, with a unique (normalised) trace tr, 
which restricts to the unique trace on each M_n. Thus each tr(e_n) = τ. The algebra A_{∞,τ} 
spanned by the projections e_i obeys the relations 


e_i^2 = e_i = e_i^*,    (6.2.15a) 
e_i e_{i±1} e_i = τ e_i,    (6.2.15b) 
e_i e_j = e_j e_i if |i − j| ≥ 2,    (6.2.15c) 
tr(x e_{n+1}) = tr(x) τ,    (6.2.15d) 


where x is in the (finite-dimensional semi-simple) algebra A_{n,τ} generated by 
1, e_1, ..., e_{n-1}. In fact these are the complete list of relations for A_{∞,τ}, because the 
(normalised) trace tr on any type II_1 factor obeys tr(xx*) ≥ 0 with equality only if 
x = 0. The (easy) proofs of all these statements are in [319]. The point is that the tower 
M_0 ⊂ M_1 ⊂ ··· and the projections e_1, e_2, ... depend only on the original subfactor. 
Positive-definiteness of the trace on A_{∞,τ} gives the discrete values of Theorem 6.2.9. 

Of course we are now trained to recognise (6.2.15b) and (6.2.15c) as having to do with 
the braid groups. In particular, if we try to send the braid group generator σ_i to a e_i + b, 
we obtain the solution a = t + 1, b = −1, where t satisfies t + t^{-1} + 2 = τ^{-1}. Thus to 
any finite index type II_1 subfactor, we get a representation of the braid group! 
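
The braid relation for σ_i ↦ (t + 1)e_i − 1 can be checked numerically in a small matrix model. The sketch below is not the subfactor construction itself: it uses an assumed, standard 8 × 8 realisation of e_1, e_2 on C^2 ⊗ C^2 ⊗ C^2 built from a rank-one matrix `E` depending on a real parameter `q`, for which τ = 1/(q + 1/q)^2 (so the index exceeds 4) and t = q^2. All helper names are illustrative.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def lin(a, A, b, B):
    """Entrywise linear combination a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def close(A, B, tol=1e-9):
    """True if the two matrices agree entrywise up to tol."""
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


q = 1.3                      # arbitrary real parameter (assumption)
delta = q + 1 / q
tau = 1 / delta ** 2
t = q * q                    # then t + 1/t + 2 = delta**2 = 1/tau

# Rank-one 4x4 matrix with E*E = delta*E, acting on C^2 (x) C^2.
E = [[0, 0, 0, 0], [0, q, 1, 0], [0, 1, 1 / q, 0], [0, 0, 0, 0]]
I2, I8 = identity(2), identity(8)
e1 = [[x / delta for x in row] for row in kron(E, I2)]
e2 = [[x / delta for x in row] for row in kron(I2, E)]

g1 = lin(t + 1, e1, -1, I8)  # image of sigma_1
g2 = lin(t + 1, e2, -1, I8)  # image of sigma_2

print(close(matmul(e1, e1), e1))                                    # (6.2.15a)
print(close(matmul(matmul(e1, e2), e1), lin(tau, e1, 0, e1)))       # (6.2.15b)
print(close(matmul(matmul(g1, g2), g1), matmul(matmul(g2, g1), g2)))  # braid relation
```

Expanding g_1 g_2 g_1 − g_2 g_1 g_2 by hand gives a multiple of (t + 1)^2 τ − t, which vanishes exactly when t + t^{-1} + 2 = τ^{-1}.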

We know how to go from a braid group representation to a link invariant: we need 
to associate a number with each braid that is invariant under the two Markov moves 
(Section 1.2.3). For a braid β ∈ B_n, the combination 


J_β(t) = (−(√t + 1/√t))^{n−1} (√t)^{deg β} tr(β)    (6.2.16) 


works (verify this), where ‘deg β’ is defined in Section 1.1.4 and ‘tr(β)’ means the trace 
of the corresponding element in M_n. This function J_β is the famous Jones polynomial. 

Witten showed that the Jones polynomial can be recovered from the topological field 
theory (or modular category) associated with affine algebra A_1^{(1)} at level k ∈ N, when 
the highest weight ω_1 + (k − 1)ω_0 is assigned to each strand of the link. Of course, 
there is no need to restrict to A_1^{(1)} or that weight, and other choices yield other link 
invariants. 

Can the subfactor approach also recover these other link polynomials, or is it inherently 
‘rank 1’? Is the full topological field theory (or if you prefer, the CFT or modular category) 
obtainable from the subfactor, or does the subfactor only see the link polynomials? The 
answer to both questions is yes; the construction was originally due to Ocneanu, and is 
explained carefully in [177] (see also [354] for a very accessible treatment of certain 
parts of the theory). The starting point is the realisation that the projections e; are only 
a small part of the full tower M_0 ⊂ M_1 ⊂ M_2 ⊂ ···. 

Subtleties in any representation theory arise through the interplay of addition with 
multiplication, and with contragredient (dual). Addition (direct sum) of modules comes 
for free here. Unfortunately, the modules of factors (which we briefly described at the 
end of Section 1.3.2) don’t have an obvious tensor product, and in any case are rather 
colourless (e.g. there is a unique nontrivial module for type III factors). 

The right objects to study here are bimodules. We call a Hilbert space X = _M X_N an 
M–N bimodule if M acts on the left and N on the right. The point is that they have a 
natural multiplication: the relative tensor product (‘Connes fusion’) _M X_N ⊗_N _N Y_P will 
be an M–P bimodule. The multiplicative identity (playing the role of the trivial 
one-dimensional module) is _M L^2(M)_M, usually abbreviated to _M M_M. Given any bimodule 
_M X_N, the conjugate Hilbert space X̄ is naturally an N–M bimodule: n x̄ m := \overline{m* x n*}. 
Moreover, the possibilities for bimodules are far richer than for modules. 

Let N ⊂ M be an inclusion of II_1 factors with finite Jones index [M : N]. 
Recall the tower M_0 = N ⊂ M = M_1 ⊂ M_2 ⊂ ··· arising from the basic construction. 
Let Φ_M denote the set of equivalence classes of irreducible M–M submodules 
of ⊕_{n≥1} _M L^2(M_n)_M, and Φ_N that for the irreducible N–N submodules of 
⊕_{n≥0} _N L^2(M_n)_N. We require these sets to be finite (‘finite depth’). Write H^C_{AB} for the 
(finite-dimensional) intertwiner space Hom_{M–M}(C, A ⊗_M B). For any A, B ∈ Φ_M, 
the product A ⊗_M B can be decomposed into a finite sum Σ_{C∈Φ_M} N^C_{AB} C, where 
N^C_{AB} = dim H^C_{AB} ∈ N are the multiplicities. Indeed, all axioms of a fusion ring will 
be obeyed, except usually commutativity and self-duality. 

Fig. 6.10 The principal and dual principal graphs associated with S_3. 

Returning to the Galois theory analogy, the Jones index merely corresponds to the 
degree of the field extension. To what corresponds the Galois group? Ocneanu’s answer 
is an intricate subfactor invariant called a paragroup [453] (see especially chapter 10 
of [177]). It consists of two graphs (the principal and dual principal), whose vertices 
are bimodules for M and N; an order-2 involution of the vertices corresponding to 
the contragredient map A ↦ Ā; and a ‘connection’, that is an assignment of complex 
numbers to closed paths in the graphs, reminiscent of 6j-symbols, describing the change 
between natural bases. The graphs are obtained from the fusion rings; their 
Perron–Frobenius eigenvalues equal the square-roots of the Jones index. For example, when the 
Jones index is < 4 (corresponding to eigenvalue < 2), those two graphs are equal, and are 
one of A_n, D_even, E_6 or E_8 (recall Figure 1.4); it cannot be the tadpole T_n, for elementary 
reasons, while D_odd and E_7 are excluded for their inability to support a connection. Two 
inequivalent connections are possible on the E_6 and E_8 graphs, corresponding to different 
subfactors. Thus Theorem 6.2.9 indeed constitutes another realisation of A-D—E, and 
for the ultimate reason suggested in Section 2.5.2. 
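
For the A_n graphs the link between Perron–Frobenius eigenvalues and the index series can be verified directly. The sketch below assumes the standard closed form for the path graph on n vertices: eigenvalue 2cos(π/(n + 1)) with strictly positive eigenvector v_j = sin(jπ/(n + 1)). The function names are illustrative, and only the A-series is treated.

```python
import math


def path_adjacency(n):
    """Adjacency matrix of the A_n Dynkin diagram (a path on n vertices)."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


def pf_pair(n):
    """Candidate Perron-Frobenius eigenvalue and eigenvector of A_n."""
    lam = 2 * math.cos(math.pi / (n + 1))
    vec = [math.sin((j + 1) * math.pi / (n + 1)) for j in range(n)]
    return lam, vec


def check(n, tol=1e-12):
    """Return (is v a positive eigenvector with eigenvalue lam?, lam squared)."""
    A = path_adjacency(n)
    lam, v = pf_pair(n)
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    positive = all(x > 0 for x in v)
    eigen = all(abs(Av[i] - lam * v[i]) < tol for i in range(n))
    return positive and eigen, lam ** 2


for n in range(2, 7):
    ok, index = check(n)
    print(n, ok, round(index, 6))
```

A strictly positive eigenvector of a connected graph necessarily belongs to the Perron–Frobenius eigenvalue, so the squares printed here are the values 4cos^2(π/(n + 1)) of Theorem 6.2.9.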

A paragroup is a generalised (‘quantised’) sort of group. Figure 6.10 gives the 
graphs for R ⊂ R ⋊ G (for R the hyperfinite II_1 factor and G = S_3). The M–M 
bimodules are parametrised by the irreducible characters ch_i of G, with precisely 
ch_i(e) edges connecting the ith node to the root of the graph. The N–N bimodules 
are parametrised by elements of the group. The contragredient involution and 
fusion rings are the ones familiar to aficionados of character tables: complex-conjugate 
and the character ring, and g ↦ g^{-1} and the group ring CG. The connection 
explicitly recovers the group structure, much as in the topological field theory of 
Section 4.4.2. On the other hand, the graphs for R^G ⊂ R are switched. More generally, 
given any subgroup H < G, we get subfactors R^G ⊂ R^H and R ⋊ H ⊂ R ⋊ G, and 
their paragroups give a group-like interpretation to G/H even when H is not normal. 

We say subfactors N_i ⊂ M_i are equivalent if there is an isomorphism θ : M_1 → M_2 
with θ(N_1) = N_2. When M is hyperfinite type II_1, the paragroup identifies N ⊂ M up to 
equivalence. Hence, when G is a finite abelian group, R^G ⊂ R is equivalent to R ⊂ R ⋊ G 
(when instead G is nonabelian, they are merely dual). 

The paragroup yields a topological invariant for manifolds, generalising the Turaev— 
Viro one [535] (see [354] for a very readable treatment of this part of the theory). 
However, it doesn’t directly correspond to the data of an RCFT (e.g. the fusion rings 
of Figure 6.10 aren’t self-dual). To get RCFT data, we must pass from N ⊂ M to 
the ‘asymptotic inclusion’ ⟨M, M' ∩ M_∞⟩ ⊂ M_∞, where M_∞ is the (weak completion 
of the) union of all M_n. Asymptotic inclusion plays the role of Drinfel’d’s quantum-double 
here, and corresponds physically to taking the continuum limit of the lattice 
model, yielding the CFT from the underlying statistical mechanical model (see 
section 12.6 of [177]). All chiral data of the VOA or RCFT, including the link invariants, 
are obtainable from the asymptotic inclusion. For instance, the Jones index [M : N] 
equals 1/S_{00}^2. 

A very similar (but simpler) theory has been developed for type III factors. Bimodules 
now are equivalent to ‘sectors’, that is equivalence classes of endomorphisms λ : N → N 
(the corresponding subfactor is λ(N) ⊂ N). This use of endomorphisms is the key 
difference (and simplification) between the type II and type III fusion theories. Given 
λ, μ ∈ End(N), we define ⟨λ, μ⟩ to be the dimension of the vector space of intertwiners, 
that is all t ∈ N such that tλ(n) = μ(n)t ∀n ∈ N. The endomorphism λ ∈ End(N) 
is irreducible if ⟨λ, λ⟩ = 1. Let Φ be a finite set of irreducible sectors. The fusion 
product is given by composition λ ∘ μ; addition can also be defined, and the fusion 
multiplicity N_{λμ}^ν is then the dimension ⟨λ ∘ μ, ν⟩. The ‘vacuum’ 0 is the identity id_N. 
Restricting to a finite set Φ of irreducible sectors, closed under fusion, the result is 
again a (noncommutative non-self-dual) fusion ring (after all, why should the 
compositions λ ∘ μ and μ ∘ λ be related). The missing ingredients are nondegenerate braidings 
ε^±(λ, μ) ∈ Hom(λ ∘ μ, μ ∘ λ), which say roughly that λ and μ nearly commute (the ε^± 
must also obey some analogue of the Yang–Baxter equation (6.2.8)). Provided we have 
a nondegenerate braiding (which we can obtain from asymptotic inclusion as before), 
Rehren [470] proved that we will automatically have modular data. When we have a 
hyperfinite type III_1 subfactor N ⊂ M with a braided system of endomorphisms, there 
is a simple expression (see [65] and references therein) for the corresponding modular 
invariant (Definition 6.1.8) using ‘α-induction’ (a process of inducing an endomorphism 
from N to M using the braiding ε^±): we get Z_{λμ} = ⟨α^+_λ, α^-_μ⟩. The NIM-rep is defined 
similarly [65]. 

Wassermann and collaborators (see e.g. [554]) have explicitly constructed the affine 
algebra subfactors, recovering the affine algebra modular data, at least for A,“ and 
B,®. To any subgroup-group pair G < H, the subfactor R ⋊ G ⊂ R ⋊ H of 
crossed-products has a (in general non-commutative) fusion-like ring. But sometimes it will have a 
braiding: for example, the diagonal embedding G < G × G recovers the finite group 
modular data of Section 6.2.4. 

These approaches cannot reconstruct the full RCFT or VOA. To give a simple example, 
the VOA associated with any even self-dual lattice or the Moonshine module corresponds 
to the trivial subfactor N = M, where M is the unique hyperfinite type II_1 factor. The 
way to get more information uses nets of subfactors. 

There are two standard axiomatisations of quantum field theory (Section 4.2.4). The 
Wightman axioms, applied to two-dimensional CFT, yield quite naturally a VOA (see 
chapter 1 of [330]). Algebraic quantum field theory [269], on the other hand, leads to 
subfactors. In particular, to any open set O in Minkowski space we are to assign 
a von Neumann algebra A(O) ⊂ L(H) of observables localised to O, obeying various 
properties (such as O_1 ⊂ O_2 implies A(O_1) ⊂ A(O_2)). The axioms imply these A(O) 
will all be type III_1 factors. In two dimensions, choosing ‘light-cone’ coordinates x_0 ± x_1, 
we can take these O to be the product I × J of open intervals I, J ⊂ R. This means 
that for most purposes the theory decomposes into a one-dimensional net A(I), the 
chiral theory. The one-dimensional ‘space-time’ R is compactified to S^1, and requiring 
the theory to be covariant with respect to Diff(S^1), the result is called a local conformal 
net. The theory of these one-dimensional nets should be equivalent to that of VOAs, and 
that of the two-dimensional ones to the full RCFT, though most details of this equivalence 
are still to be established. Nevertheless, some aspects of the theory will likely remain 
much more accessible using, for example, subfactors than VOAs (in particular, orbifolds 
seem simpler in subfactor theory). For references and results, see, for example, [341], 
[340], [568], [332] and references therein. 


Question 6.2.1. Prove equation (6.2.3a). 


Question 6.2.2. Find all NIM-reps for A_1^{(1)} at each level k = 1, 2, 3, ... (Hint: Verify that 
the Perron–Frobenius eigenvalue of M_1 is S_{10}/S_{00} = 2 cos(π/(k + 2)) < 2.) 


Question 6.2.3. (a) Find a continuous one-parameter deformation of the three- 
dimensional complex Lie algebra span{x, y, z} with brackets [xy] = x, [xz] = [yz] = 0. 
(b) Verify that any continuous deformation of A_1 is trivial. 


Question 6.2.4. Let M, N be left A-modules, where A is a Hopf algebra. Prove that 
Hom_C(M, N) is a left A-module. 


Question 6.2.5. (a) When does the character table of a finite group, with rows and columns 
appropriately normalised and ordered, equal the S-matrix of modular data? 

(b) Let G be finite and abelian. Is the fusion ring for the quantum double D(G) (see 
Section 6.2.4) isomorphic to the group ring of G x G? 


Question 6.2.6. Let G be any finite group and consider the modular data of (6.2.12). 
Find the conjugation C, the simple-currents J and their action and monodromy @,, and 
identify the group of all simple-currents. Identify the Galois action and parities. 


Question 6.2.7. Prove that any finite group can be realised as a subgroup of the group of 
automorphisms of a holomorphic VOA. (Hint: think of self-dual lattices.) 


Question 6.2.8. Identify the knot group π_1(R^3 \ T) of the trefoil, using the Wirtinger 
presentation of Figure 6.8. 


Question 6.2.9. Prove, using the Reidemeister moves, that the Wirtinger presentation 
yields the same group no matter which knot diagram is chosen for the given knot. 


Question 6.2.10. Recall (6.2.15). Find all values a, b such that σ_i ↦ a e_i + b, i = 
1, ..., n − 1, yields a representation of the braid group B_n in A_{n,τ}. 


6.3 Hints of things to come 


String theory has profoundly affected geometry (e.g. elliptic genus and mirror symmetry), 
algebra (e.g. VOAs) and topology (e.g. knot invariants), but so far it has had little impact 
on number theory. That may have something to do with the knowledge and interests 
of the individuals who have developed its mathematical side. There are in fact several 
indications of deep relations with number theory, waiting to be developed. In this section 
we sketch some of these. 


6.3.1 Higher-genus considerations 


String theory tells us that CFT can live on any surface Σ. The VOAs, including the 
geometric VOAs of Section 5.4.1, capture CFT in genus 0. The graded dimensions and 
traces considered above concern CFT quantities (‘chiral blocks’) at genus 1: τ ↦ e^{2πiτ} 
maps H onto a cylinder, and the trace identifies the two ends. But there are analogues of 
all this at higher genus [573] (though the formulae can rapidly become awkward). We 
have alluded to this throughout the book so will only add some quick remarks here. Our 
main point is that this is surely the direction for important future research, with direct 
implications to Moonshine. 

For example, the graded dimension of the V^♮ CFT in genus 2 is computed in [533], 
and involves, for example, Siegel theta functions. The higher-genus mapping class group 
representations coming from the A_1^{(1)} RCFT are studied in [220]. A more radical 
suggestion, using projective limits, is given in Section 4.3.3. 

The orbifold theory in Sections 5.3.6 and 7.3.2 is genus 1: each sector (g, h) 
corresponds to a homomorphism from the fundamental group Z^2 of the torus into the orbifold 
group G (e.g. G = M); g and h are the targets of the two generators of Z^2 and so must 
commute. More generally, the sectors correspond to homomorphisms φ : π_1(Σ) → G, 
and for each we get a higher-genus trace Z(φ), which are functions on the Teichmüller 
space T_g (generalising the upper half-plane H for genus 1). The action (7.3.3) of SL_2(Z) 
on N_{(g,h)} generalises to the action of the mapping class group on π_1(Σ) and Z(φ). 

For example, we can count the number of inequivalent homomorphisms π_1(Σ) → G, 
for Σ a compact genus-g surface. This number is given by Verlinde’s formula (6.1.2) 
together with the expression (6.2.12a) [194]: 


N^{(g,0)} = Σ_h Σ_{ch ∈ Irr(C_G(h))} ( ||C_G(h)|| / ch(e) )^{2g−2},    (6.3.1) 


where we sum over representatives h of the various conjugacy classes of G. 
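
As a sanity check, the count can be compared with a brute-force enumeration for G = S_3 in genus 1 and 2. The sketch below is illustrative only: it takes π_1 of a genus-g surface to have the usual presentation with generators a_1, b_1, ..., a_g, b_g and the single relation that the product of commutators [a_i, b_i] is trivial, reads "inequivalent" as "up to simultaneous conjugation by G", and hard-codes the centraliser data of S_3 in `CENTRALISERS` (all names are assumptions made for the example).

```python
from itertools import permutations, product

G = list(permutations(range(3)))          # the six elements of S3
E = (0, 1, 2)                             # identity permutation


def mul(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))


def inv(p):
    r = [0] * 3
    for i, x in enumerate(p):
        r[x] = i
    return tuple(r)


def comm(a, b):
    """Commutator a b a^-1 b^-1."""
    return mul(mul(a, b), mul(inv(a), inv(b)))


def brute_force(genus):
    """Number of conjugacy classes of homomorphisms pi_1(Sigma_g) -> S3."""
    homs = set()
    for xs in product(G, repeat=2 * genus):
        w = E
        for a, b in zip(xs[0::2], xs[1::2]):
            w = mul(w, comm(a, b))
        if w == E:
            homs.add(xs)
    # Canonical representative of each orbit under simultaneous conjugation.
    orbits = {min(tuple(mul(mul(k, x), inv(k)) for x in xs) for k in G)
              for xs in homs}
    return len(orbits)


# (order of centraliser, degrees of its irreducible characters) for the
# class representatives e, (12), (123) of S3: centralisers S3, Z2, Z3.
CENTRALISERS = [(6, [1, 1, 2]), (2, [1, 1]), (3, [1, 1, 1])]


def formula(genus):
    """Right-hand side of (6.3.1) for G = S3."""
    return sum((order // d) ** (2 * genus - 2)
               for order, degs in CENTRALISERS for d in degs)


print(brute_force(1), formula(1))  # 8 8
print(brute_force(2), formula(2))  # 116 116
```

In genus 1 the exponent 2g − 2 vanishes and the formula simply counts the pairs (h, ch), that is the primaries of the finite group modular data.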


6.3.2 Complex multiplication and Fermat 


A few years ago Philippe Ruelle was walking in a library in Dublin. He spotted a yellow 
book in the mathematics section, called Complex Multiplication [367]. A strange title 
for a book by Lang! Ruelle flipped it to a random page, which turned out to be 26. There 
he found what we would call the Galois selection rule (6.1.15a) for A_2^{(1)}, analysed 
and solved for the cases where k + 3 is coprime to 6. Lang, however, knew nothing of 
modular invariants; he was reviewing work by Koblitz–Rohrlich [351] on decomposing 
the Jacobians of the Fermat curve x^n + y^n + z^n = 0 into their prime pieces, called 
‘simple factors’. 

Fix n > 3. Let F_n denote the nth Fermat curve, that is the projective complex curve 
x^n + y^n + z^n = 0. We will describe some similarities with the modular data of A_2^{(1)} at 
level k = n − 3. 

First, let’s review some A_2^{(1)} chiral data. Call a pair (r, s) ∈ N × N admissible if 
1 ≤ r, s and r + s < n. The integrable highest weights λ ∈ P_+^k(A_2^{(1)}) are in one-to-one 
correspondence with the admissible pairs, given by λ_{(r,s)} := (n − r − s − 1)ω_0 + (r − 
1)ω_1 + (s − 1)ω_2. For any admissible (r, s), define 


H_{r,s} = {ℓ ∈ Z_n^× | ⟨ℓr⟩ + ⟨ℓs⟩ < n}, 


where Z_N^× is (as always) the multiplicative group (mod N) of integers coprime to N, and 
⟨a⟩ is the unique integer 0 ≤ ⟨a⟩ < n congruent to a (mod n). Then Z_{3n}^× is the Galois 
group over Q of the field generated by all entries S_{λμ} of the A_2^{(1)} level-k matrix S. The 
Galois selection rule (6.1.15a) says that if Z is a modular invariant, then 


Z_{λ(r,s), λ(r',s')} ≠ 0 ⇒ H_{r,s} = H_{r',s'}. 


The hard part of the A_2^{(1)} modular invariant classification involves solving this condition 
H_{r,s} = H_{r',s'} [232]. 
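
The sets H_{r,s} are easy to tabulate for small n. The following sketch assumes the definition just given (units ℓ mod n with ⟨ℓr⟩ + ⟨ℓs⟩ < n); the function names `admissible`, `units`, `H` and `classes` are illustrative. It groups the admissible pairs by their set H_{r,s}, which is the combinatorial condition the selection rule imposes.

```python
from math import gcd, comb


def admissible(n):
    """Pairs (r, s) with 1 <= r, s and r + s < n."""
    return [(r, s) for r in range(1, n) for s in range(1, n - r)]


def units(n):
    """Representatives of the multiplicative group of integers mod n."""
    return [l for l in range(1, n) if gcd(l, n) == 1]


def H(r, s, n):
    """The set H_{r,s} of units l mod n with <l r> + <l s> < n."""
    return frozenset(l for l in units(n) if (l * r) % n + (l * s) % n < n)


def classes(n):
    """Group the admissible pairs by their set H_{r,s}."""
    out = {}
    for r, s in admissible(n):
        out.setdefault(H(r, s, n), []).append((r, s))
    return out


n = 7
print(len(admissible(n)), comb(n - 1, 2))   # number of weights at level n - 3
for h, pairs in classes(n).items():
    print(sorted(h), pairs)
```

Since ⟨ℓr⟩ + ⟨ℓs⟩ + ⟨ℓ(n − r − s)⟩ is either n or 2n, and ℓ ↦ −ℓ exchanges the two cases, each H_{r,s} contains exactly half of the units mod n.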

Before we compare this to F_n, let’s introduce some geometric terminology. An abelian 
variety is a torus of the form C^m/L, where L is a 2m-dimensional lattice in C^m, which 
admits an embedding into projective space. This means there is a Hermitian form on 
C^m (defined in Section 1.1.3), whose imaginary part takes integer values when restricted 
to L. Most tori (when m > 1) don’t satisfy this Hermitian form condition, though it is 
automatic when m = 1. We say two abelian varieties C^m/L and C^m/L' are isogenous if 
there exists a continuous group homomorphism from one to the other that is surjective; 
equivalently, if there is an invertible complex-linear endomorphism of C^m taking the 
lattice L onto a sublattice of L'. Isogeny is an equivalence relation preserving most 
things of interest. 

Now suppose an abelian variety C^m/L contains another, C^n/L', of dimension n < m. 
Then the Hermitian form can be used to show that the original variety is isogenous to 
the product of C^n/L' with some C^{m−n}/L'' (roughly, L'' is the orthogonal complement 
of L' in L). Continuing in this way, we get that any abelian variety is isogenous to the 
product of simple factors, where simple factor means an abelian variety containing no 
proper abelian subvariety. 

A very special property that an abelian variety may possess is complex multiplication. 
The general definition is a little too complicated to get into here (see 
chapter 1.4 of [367]), so let’s restrict to one-dimensional abelian varieties, that is the torus 
A_τ = C/(Z + τZ). We say A_τ has complex multiplication if its endomorphism ring 
End(A_τ) is strictly greater than Z; equivalently, if there is a non-integer z ∈ C such that 
z(Z + τZ) ⊂ Z + τZ (hence the name). It turns out that if A_τ has complex multiplication, 
then (among other things) j(τ) is an algebraic integer. This illustrates just how rare 
complex multiplication is: only countably many A_τ have it. It also illustrates its 
number-theoretic significance, which only becomes more profound as the dimension rises. 
We get an abelian variety from any complex projective curve, by taking the Jacobian 
(Section 2.1.4), which is of complex dimension equal to the genus. In the case of the 
n-1 


Fermat curve F,, the genus is ( n ), which equals the cardinality || PŁ (AP) ||. A bijection 
between P A cay ) and a basis of holomorphic 1-forms is 


dx 
yr? 
for any admissible (r, s). For each (r, s) let [r, s] denote the H, ,-orbit {((¢r), (€5) )}eeu,,- 
Then the Jacobian Jac(F’,) is isogenous to the product, over all orbits [7, s], of a |Z% || /2- 
dimensional abelian variety Aj s], form = n/gcd(r, s,n —r — s). All Ajs] have com- 
plex multiplication, which simplifies the following analysis. 

We wish to decompose $\mathrm{Jac}(F_n)$ into a product of simple factors. Thus we need to
know when the $A_{[r,s]}$ are isogenous to one another, and also when they are simple. Both
questions reduce to knowing when $H_{r,s} = H_{r',s'}$, which as we mentioned earlier is also
the key step in the $A_2^{(1)}$ modular invariant classification.

Similarly, Itzykson discovered traces of the $A_2^{(1)}$ exceptionals (these occur when
$k + 3 = 8, 12, 24$) in the Jacobian of $F_{24}$. See [46] for additional observations.

The point is that the combinatorial heart of two very different problems, namely the decomposition
of the Jacobian of Fermat curves into simple factors and the classification of
RCFT associated with $A_2^{(1)}$, is identical. Nevertheless, this must seem a little ad hoc.
What is needed are other independent probes of this (still hypothetical) relationship. One
possibility, suggested by the presence of complex multiplication, is the following.

Basic data associated with an algebraic variety $V$ is its zeta-function $L(V, s)$, which
counts its points over various finite fields. Isogenous varieties have equal zeta-functions.
The Mellin transform of the zeta-function (Section 2.3.1) formally gives a $q$-series
$f_V(\tau) = \sum_n a_n q^n$. For a typical variety $V$, $f_V$ won’t have any special properties, but
when $V$ has complex multiplication, the zeta-function decomposes into a product of
Hecke $L$-functions, and their $q$-series do have modularity properties [505], [506].
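The point counts behind a zeta-function are easy to compute in small cases. As an illustration only, the sketch below takes the elliptic curve $y^2 = x^3 - x$ (our choice of example; it has complex multiplication by $\mathbb{Z}[i]$) and computes $a_p = p + 1 - \#E(\mathbb{F}_p)$ by brute force. A well-known consequence of complex multiplication for this curve is that $a_p = 0$ for every prime $p \equiv 3 \pmod 4$.

```python
def affine_point_count(p, a=-1, b=0):
    """Number of (x, y) in F_p x F_p with y^2 = x^3 + a*x + b."""
    # How many y have a given square y^2 mod p.
    square_roots = {}
    for y in range(p):
        square_roots[y * y % p] = square_roots.get(y * y % p, 0) + 1
    return sum(square_roots.get((x**3 + a * x + b) % p, 0) for x in range(p))

def trace_of_frobenius(p, a=-1, b=0):
    """a_p = p + 1 - #E(F_p), where #E(F_p) = affine count + 1 (point at infinity)."""
    return p - affine_point_count(p, a, b)

for p in (3, 5, 7, 11, 13, 17, 19, 23):
    print(p, trace_of_frobenius(p))
```

The vanishing of half the coefficients is the kind of regularity that makes the associated $q$-series modular; a generic curve shows no such pattern.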

Thus, associated with the abelian varieties $A_{[r,s]}$, by virtue of complex multiplication,
are various sorts of modular forms. And associated with the weights $\lambda_{(r,s)}$, by virtue
of being integrable highest weights of an affine algebra, are various sorts of modular
forms.




Problem How are the modular forms associated with the zeta-functions of the factors
$A_{[r,s]}$ in the Jacobian of the Fermat curve $F_n$ related to the modular forms associated
with integrable highest-weight modules of $A_2^{(1)}$ at level $n - 3$?


The easiest $n$ to check will be $n = 4, 6, 8, 12$, since for them $\mathrm{Jac}(F_n)$ is isogenous to a
product of elliptic curves. A somewhat related project, concerning $A_1^{(1)}$, is proposed in
[490], though nothing definite has been achieved there yet.




In any case, these Fermat $\leftrightarrow$ $A_2^{(1)}$ ‘coincidences’ are still not understood. It is tempting
to guess that, more generally, the $A_r^{(1)}$ level-$k$ modular invariant classification is
somehow related to the hypersurface $x_1^n + \cdots + x_r^n = z^n$, for $n = k + r + 1$, but this is
probably too naive. As with other meta-patterns, the most realistic hope wouldn’t be
to find a direct connection between Fermat curves and the RCFTs associated with $\mathfrak{sl}_3$.
Rather, the idea is to identify the combinatorial nugget common to both. The real hope
would be that this ‘coincidence’ lies in a series: A-D-E for $\mathfrak{sl}_2$, Fermat for $\mathfrak{sl}_3$, ..., and
that this would lead to insights into $\mathfrak{sl}_4$ RCFT and beyond.

Complex multiplication in CFT has been the subject of other work — see [435] for 
several references. Let’s mention two examples. Arithmetic varieties related to number 
fields seem to be naturally selected in the study of black holes in Calabi-Yau compacti- 
fications of string theory [435]. It has been conjectured [268] that superconformal field 
theory with target space given by a Calabi-Yau manifold M will be rational iff both M
and its mirror have complex multiplication. 


6.3.3 Braided # 6: the absolute Galois group 


The absolute Galois group of the rationals is the group of symmetries of the field of 
algebraic numbers. It is the most important, and most poorly understood, group in algebraic
number theory. But it also has deep contacts with geometry (through the generalised 
Riemann existence theorem), and there have been several proposals conjecturing its 
relevance to RCFT (see e.g. [128], [435], [268] and references therein), and even quantum 
field theory [106], [93]. 

Recall the discussion of algebraic numbers and Galois groups in Section 1.7. The
algebraic closure $\overline{\mathbb{Q}}$ of the rationals is the set of all algebraic numbers, or equivalently
the union of all finite-dimensional field extensions of $\mathbb{Q}$. The absolute Galois group of $\mathbb{Q}$
is $\Gamma_{\mathbb{Q}} := \mathrm{Gal}(\overline{\mathbb{Q}}/\mathbb{Q})$. It’s uncountably infinite, and extremely complicated. Only two of
its elements have names: the identity and complex conjugation. If $K$ is any finite Galois
extension of $\mathbb{Q}$, then its Galois group $G = \mathrm{Gal}(K/\mathbb{Q})$, which will be a finite group, is
a homomorphic image of $\Gamma_{\mathbb{Q}}$ and so is a quotient $\Gamma_{\mathbb{Q}}/N$ of $\Gamma_{\mathbb{Q}}$. Much effort has been
devoted to discovering which groups $G$ can arise as Galois groups over $\mathbb{Q}$ (see [548] for
a review of the so-called inverse Galois problem).


Conjecture 6.3.1 Any finite group $G$ is a quotient of $\Gamma_{\mathbb{Q}}$.


This conjecture shows just how complicated $\Gamma_{\mathbb{Q}}$ is. Incidentally, there are many nontrivial
points of contact between braid groups and inverse Galois theory (see e.g. [549]).

$\Gamma_{\mathbb{Q}}$ is an example of a profinite group, that is a projective limit of finite groups (here,
of the Galois groups $G$). We define projective limit in Section 2.4.1; the indexing set
here consists of the fields $K$, ordered by inclusion, to each of which is attached its Galois group $G$. This
just means that $\sigma \in \Gamma_{\mathbb{Q}}$ consists of a choice of Galois automorphism $\sigma_K$ for each finite
extension $K \supset \mathbb{Q}$, which obeys the obvious compatibility constraint (if $K \subset L$, then $\sigma_L$
restricted to $K$ must equal $\sigma_K$). Thus, if the conjecture is true, $\Gamma_{\mathbb{Q}}$ would be the limit
$\varprojlim G$ of all finite groups, in this sense. Of course any finite group is also a quotient of
some free group $F_n$, and so we may wonder if $\Gamma_{\mathbb{Q}}$ and $F_n$ are somehow related.

Thanks to their realisations as fundamental groups, the braid group $\mathcal{B}_n$ acts faithfully on
$F_n$ (Question 6.3.5); in other words, $\mathcal{B}_n$ can be regarded as a subgroup of $\mathrm{Aut}(F_n)$. This
can be seen as follows. Recall the space $\mathfrak{C}_n$ of (1.2.6). We have the obvious projection
$\pi : \mathfrak{C}_{n+1} \to \mathfrak{C}_n$, given by forgetting the $(n+1)$th point. Hence $\pi$ induces an action
of the fundamental group $\pi_1(\mathfrak{C}_n)$ of the base on the fundamental group of the fibre
$\pi^{-1}(z_1, \ldots, z_n) = \mathbb{C} \setminus \{z_1, \ldots, z_n\}$, that is an action of the pure braid group $\mathcal{P}_n$ on $F_n$.
The action of $\mathcal{B}_n$ is obtained similarly. We will find that similar reasoning allows us to
replace $\mathcal{B}_n$ by $\Gamma_{\mathbb{Q}}$, and $F_n$ by its profinite completion.
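The braid group action on the free group can be explored by machine. The sketch below assumes the standard Artin form of the action, $\sigma_i.x_i = x_{i+1}$, $\sigma_i.x_{i+1} = x_{i+1}^{-1} x_i x_{i+1}$, with the other generators fixed; this explicit formula is our assumption (conventions differ by inverses in the literature). Words are tuples of nonzero integers, $k$ standing for $x_k$ and $-k$ for $x_k^{-1}$.

```python
def reduce(word):
    """Freely reduce a word by cancelling adjacent x, x^-1 pairs."""
    out = []
    for g in word:
        if out and out[-1] == -g:
            out.pop()
        else:
            out.append(g)
    return tuple(out)

def inverse(word):
    return tuple(-g for g in reversed(word))

def apply(aut, word):
    """Apply an automorphism (dict: generator index -> image word) to a word."""
    image = []
    for g in word:
        image.extend(aut[g] if g > 0 else inverse(aut[-g]))
    return reduce(image)

def compose(f, g):
    """The automorphism x -> f(g(x))."""
    return {k: apply(f, g[k]) for k in g}

def sigma(i, n):
    """Assumed Artin generator: x_i -> x_{i+1}, x_{i+1} -> x_{i+1}^-1 x_i x_{i+1}."""
    aut = {k: (k,) for k in range(1, n + 1)}
    aut[i] = (i + 1,)
    aut[i + 1] = (-(i + 1), i, i + 1)
    return aut

s1, s2, s3 = sigma(1, 4), sigma(2, 4), sigma(3, 4)
print(compose(s1, compose(s2, s1)) == compose(s2, compose(s1, s2)))  # braid relation
print(apply(s2, (1, 2, 3, 4)))  # the product x_1 x_2 x_3 x_4 is fixed
```

Checking the braid relations on generators is enough to see that the assignment extends to a homomorphism from the braid group to the automorphism group of the free group; faithfulness is a deeper fact and is not tested here.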

Let $X$ be an algebraic variety defined over $\mathbb{Q}$; that is, $X$ is defined as the set of
solutions $(z_1, \ldots, z_n) \in \mathbb{C}^n$ to a collection of polynomial equations $p_i(z_1, \ldots, z_n) = 0$, and the
polynomials have coefficients in $\mathbb{Q}$. Let $X(\mathbb{Q})$ be the set of points $(z_1, \ldots, z_n) \in X$ with
all coordinates $z_i \in \mathbb{Q}$. Fix a base-point $p \in X(\mathbb{Q})$ (assuming one exists).

Let $N$ be a finite-index normal subgroup of $\pi_1(X, p)$. Then by the geometric Galois
correspondence (Section 1.7.2), $N$ corresponds to a finite Galois cover $f_N : X_N \to X$
of $X$, with $\pi_1(X_N) = N$, and the quotient $\pi_1(X, p)/N$ can be identified with the set of
homeomorphisms $\gamma : X_N \to X_N$ satisfying $f_N \circ \gamma = f_N$. Each $\gamma$, restricted to the finite
set $f_N^{-1}(p)$, will be a permutation, and this permutation uniquely determines it.

By the generalised Riemann existence theorem (Grauert-Remmert, 1958), each finite
cover $X_N$ of $X$ is an algebraic variety defined over $\overline{\mathbb{Q}}$. Thus each automorphism $\sigma \in \Gamma_{\mathbb{Q}}$
permutes the finite covers of $X$ (or if you prefer, the normal subgroups $N$): it acts on $X_N$
by acting simultaneously on the coefficients of all the defining polynomials of $X_N$.

Grothendieck [267] explained that $\Gamma_{\mathbb{Q}}$ acts on the profinite completion $\widehat{\pi}_1(X, p)$ of
the fundamental group of $X$, called the algebraic fundamental group of $X$. This means
the following. The profinite completion $\widehat{G}$ of a group $G$ is the projective limit $\varprojlim G/N$
over all finite quotients $G/N$ (i.e. $N$ runs over all normal subgroups of finite index in $G$).
An element $\widehat{g} \in \widehat{G}$ consists of a choice $g_N N$ of coset in $G/N$ for each such $N$, such that
whenever $N_1$ is a subgroup of $N_2$, then $g_{N_1} N_2 = g_{N_2} N_2$. This should remind us of the
construction of the $p$-adic integers $\mathbb{Z}_p$; indeed, $\widehat{\mathbb{Z}} = \prod_p \mathbb{Z}_p$ is the profinite completion
of $\mathbb{Z}$. Profinite completion is the algebraic analogue of the topological completion of a
space by Cauchy sequences (as in the construction of $\mathbb{R}$ from $\mathbb{Q}$). Its purpose is the same:
just as $\mathbb{R}$ fills in the ‘gaps’ in $\mathbb{Q}$, so does $\widehat{G}$ supply the missing elements in $G$. For example,
/2 exists in Z but not in Z. Of course, being a projective limit, the profinite completion 
is also an ‘integration’ of all $G/N$, that is a way of treating them all simultaneously. A
solution in $\widehat{\mathbb{Z}}$ to a polynomial equation gives us simultaneously a solution modulo any $n$.
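The compatibility condition defining an element of a profinite completion is easy to state in code, at least after truncating to finitely many quotients. The sketch below (our own illustration, with hypothetical names) stores a candidate element of the profinite completion of $\mathbb{Z}$ as a dictionary $n \mapsto \ell_n$ and checks that $\ell_n \equiv \ell_m \pmod m$ whenever $m$ divides $n$.

```python
def is_compatible(residues):
    """residues maps n -> l_n (mod n); require l_n = l_m (mod m) whenever m | n."""
    return all((residues[n] - residues[m]) % m == 0
               for n in residues for m in residues if n % m == 0)

def truncate(integer, levels):
    """The image of an ordinary integer in the truncated system of quotients Z/n."""
    return {n: integer % n for n in levels}

print(is_compatible(truncate(7, range(1, 13))))   # an honest integer is compatible
print(is_compatible({2: 1, 4: 2}))                # 2 mod 4 does not reduce to 1 mod 2
```

Any finite truncation of this kind is hit by an ordinary integer (by the Chinese remainder theorem), so the genuinely new elements of the completion are only visible in the limit; the code illustrates the constraint, not the limit.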

For example, $\widehat{\ell} \in \widehat{\mathbb{Z}}$ corresponds, for each $n \in \mathbb{N}$, to an integer $\ell_n$ defined modulo $n$,
subject to the obvious compatibility condition. Then an element $\widehat{\ell}$ is invertible, written
$\widehat{\ell} \in \widehat{\mathbb{Z}}^\times$, iff for each $n > 1$, $\ell_n$ is invertible mod $n$. Hence any $\widehat{\ell} \in \widehat{\mathbb{Z}}^\times$ has a well-defined
action on finite-order roots of unity: given any $n$th root of unity $\xi$, $\xi^{\widehat{\ell}}$ is defined to be $\xi^{\ell_n}$.
In fact, consider the field $\mathbb{Q}^{\mathrm{ab}}$ obtained by taking the union of all cyclotomic fields (or
equivalently, by Theorem 1.7.1, all abelian extensions of $\mathbb{Q}$). Its Galois group $\mathrm{Gal}(\mathbb{Q}^{\mathrm{ab}}/\mathbb{Q})$
can be naturally identified with the multiplicative group $\widehat{\mathbb{Z}}^\times$ in this way. This is just the
action of $\Gamma_{\mathbb{Q}}$ restricted to cyclotomic fields; call this restriction the cyclotomic character
$\chi : \Gamma_{\mathbb{Q}} \to \widehat{\mathbb{Z}}^\times$ (this is a ‘character’ in the sense of a one-dimensional representation,
not as a trace of a higher-dimensional character). This action has a large kernel; in fact,
$\widehat{\mathbb{Z}}^\times$ is isomorphic to the abelianisation $\Gamma_{\mathbb{Q}}/[\Gamma_{\mathbb{Q}}, \Gamma_{\mathbb{Q}}]$.
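At each finite level the action on roots of unity is completely explicit. The following sketch (function names are ours) lists the units modulo $n$, lets a unit $\ell$ act on the $n$th root of unity $e^{2\pi i k/n}$ by raising it to the $\ell$th power, and checks the compatibility that makes the levels fit together: reducing units mod $n$ to units mod a divisor $m$ is onto.

```python
import cmath
from math import gcd

def units(n):
    """Representatives of the invertible residues modulo n."""
    return [l for l in range(n) if gcd(l, n) == 1]

def act(l, n, k=1):
    """Send the n-th root of unity exp(2*pi*i*k/n) to its l-th power."""
    return cmath.exp(2j * cmath.pi * ((k * l) % n) / n)

def reduction_is_onto(n, m):
    """Check that reducing units mod n to units mod m (for m | n) hits every unit."""
    return {l % m for l in units(n)} == set(units(m))

print(units(12))
print(reduction_is_onto(12, 4), reduction_is_onto(60, 12))
# Acting by 7 on a 12th root of unity and then cubing agrees with
# acting by 7 mod 4 = 3 on the corresponding 4th root of unity.
print(abs(act(7, 12) ** 3 - act(3, 4)) < 1e-9)
```

This is only the finite shadow of the cyclotomic character; the character itself records all levels at once.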

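Let $\widehat{\gamma} \in \widehat{\pi}_1(X, p)$, that is for each finite-index normal subgroup $N$ of $\pi_1(X, p)$, we
have a coset representative $\gamma_N$ of some coset $\gamma_N N \in \pi_1(X, p)/N$ and these $\gamma_N$ (which
we are to think of as permutations of the finite sets $f_N^{-1}(p)$) are compatible in the appropriate
way. Then for any $\sigma \in \Gamma_{\mathbb{Q}}$ and $\widehat{\gamma} \in \widehat{\pi}_1(X, p)$, the action $\sigma.\widehat{\gamma}$ is defined by


$$(\sigma.\widehat{\gamma})_N = \sigma \circ \gamma_{\sigma^{-1}N} \circ \sigma^{-1}, \qquad (6.3.2)$$


where $\sigma$ acts on the points in $f_N^{-1}(p)$, whose coordinates lie in $\overline{\mathbb{Q}}$, component-wise, and acts on the normal
subgroups $N$ as above. As we will see, choosing the variety $X$ appropriately, (6.3.2)
includes the profinite analogue of the braid group action on $F_n$ mentioned earlier: the
image of $\Gamma_{\mathbb{Q}}$ in $\mathrm{Aut}\,\widehat{F}_n$ lies in this image of $\widehat{\mathcal{B}}_n$. Equation (6.3.2) generalises to an action
of $\Gamma_{\mathbb{Q}}$ on the fundamental groupoids $\widehat{\pi}_1(X, p, q)$ of (homotopy equivalence classes of)
paths in $X$ with endpoints $p, q \in X(\mathbb{Q})$.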

Now, generically $\pi_1(X, p)$ is isomorphic to the mapping class group $\Gamma_{g,n}$, when $X$ is
the moduli space of surfaces of genus $g$ with $n$ punctures. By the modular tower we mean the collection of
moduli spaces $\mathfrak{M}_{g,n}$, where the different spaces are related by the obvious topological
actions such as forgetting marked points, or sewing surfaces together (‘tower’ means a
family of objects linked by homomorphisms). In Section 2 of his Esquisse d’un Programme,
Grothendieck conjectured that $\Gamma_{\mathbb{Q}}$ acts on the profinite completion of this tower
(i.e. on the profinite completions $\widehat{\Gamma}_{g,n}$ of all $\Gamma_{g,n}$, respecting those topological actions), and
is in fact the full automorphism group of this completion, and that this provides an effective,
almost combinatorial, way to study $\Gamma_{\mathbb{Q}}$, not directly related to its action on algebraic
numbers. He conjectured that his profinite modular tower could be reconstructed from
$\mathfrak{M}_{0,3}$, $\mathfrak{M}_{0,4}$, $\mathfrak{M}_{1,1}$, with all relations obtained from $\mathfrak{M}_{0,5}$ and $\mathfrak{M}_{1,2}$.

For example, the ordered moduli space Mo,4 is the thrice-punctured sphere P!(C) \ 
{0, 1, co} and can be defined over Q — indeed, it is just r (2)\H and has defining equa- 
tion 2125(z2 -1 = 25 -z7 +1. Its fundamental group is Fz, the free group on two 
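generators. Therefore, $\Gamma_{\mathbb{Q}}$ acts on $\widehat{F}_2$. In fact, this action is known to be faithful (Belyi,
1987), so $\Gamma_{\mathbb{Q}}$ is a subgroup of $\mathrm{Aut}\,\widehat{F}_2$. Similarly, we get an action of $\Gamma_{\mathbb{Q}}$ on $\widehat{\mathcal{B}}_n$, which
we will give shortly. This action yields one on $\widehat{\mathcal{B}}_n/Z(\widehat{\mathcal{B}}_n)$, and for $n = 3$ the latter equals
the completion $\widehat{\mathrm{PSL}_2(\mathbb{Z})}$ of the modular group (recall (1.1.10b)).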

Does Moonshine (or if you prefer, RCFT or VOAs) see this same $\Gamma_{\mathbb{Q}}$-action? After all,
modular data possesses a nice Galois action (6.1.7), as does the spectrum of the theory
(6.1.15b). Also, Grothendieck’s modular tower, with generators (0, 3), (0, 4), (1, 1) and
relations (0, 5) and (1, 2), reminds one of the Moore-Seiberg data of Section 6.1.4. There
are a few difficulties with this hope. For instance, we should take profinite limits of these
actions; for example, lift our action on $SL_2(\mathbb{Z})$ to one on $\widehat{SL_2(\mathbb{Z})}$. Can that have any
natural meaning to RCFT? Also, and most disappointingly, the modular data always lies
in cyclotomic fields, so the Galois action (6.1.7) in RCFT really only sees the rather
uninteresting action of the abelianisation $\Gamma_{\mathbb{Q}}/[\Gamma_{\mathbb{Q}}, \Gamma_{\mathbb{Q}}] \cong \widehat{\mathbb{Z}}^\times$, as explained earlier.

The first difficulty is easy to address. Subject to Conjecture 6.1.7, we obtain the
following universal actions of $\Gamma_{\mathbb{Q}}$ on modular data: for any $\sigma \in \Gamma_{\mathbb{Q}}$,


$$\sigma.T = T^{\chi(\sigma)}, \qquad (6.3.3a)$$
os = THOM) SPOON grx O 52. (6.3.3b) 


In order for (6.3.3) to make sense, these equations must live in the profinite completion
of $SL_2(\mathbb{Z})$. This is the meaning of the profinite completions here: the ‘integration’ of the
data of all RCFT (or VOAs) necessary for universal formulae. The generators $S$, $T$ of
$SL_2(\mathbb{Z})$ also generate $\widehat{SL_2(\mathbb{Z})}$, though in the topological sense (i.e. just as 1 topologically
generates $\widehat{\mathbb{Z}}$). Since the action (6.3.3) is continuous, it defines a $\Gamma_{\mathbb{Q}}$-action on $\widehat{SL_2(\mathbb{Z})}$. It
is very natural, in the sense that there is a map $\Gamma_{\mathbb{Q}} \to \widehat{SL_2(\mathbb{Z})}$ given by


x@) 0 


AES Ge es THOM) oP OMAN rL Og = 
0 x(o"') 


) e, 


(6.3.4) 
and $\sigma.S$ equals the matrix multiplication $G_\sigma S$. This map (6.3.4) is also what gives the
Galois action (2.3.14) on modular functions for $\Gamma(N)$ or, in more suggestive language, the
meromorphic functions on $\varprojlim \Gamma(N)\backslash\mathbb{H}$ (see Section 2.4.1). Of course, in RCFT there
is a preferred basis for this $SL_2(\mathbb{Z})$-representation (namely, that given by the VOA characters),
and in that basis the matrices become signed permutation matrices $\epsilon_\sigma(a)\,\delta_{a^\sigma, b}$.
It will be extremely interesting to find universal formulae for the Galois action on the
remaining Moore-Seiberg data. The difficulty is that, in obtaining (6.3.3), we were
guided by the presence of a preferred basis, and so (6.3.3) reduces to the usual Galois
action on the corresponding matrices. For the braiding and fusing matrices, typically
there isn’t a preferred basis, and so other principles must be our guide.
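The way a diagonal matrix of the shape $\mathrm{diag}(\ell, \ell^{-1})$ can be written as a word in $S$ and $T$ is elementary modulo any $N$. The sketch below is a sanity check under our own assumptions, not a transcription of (6.3.3) or (6.3.4): it takes $S = \left(\begin{smallmatrix} 0 & -1 \\ 1 & 0 \end{smallmatrix}\right)$ and $T = \left(\begin{smallmatrix} 1 & 1 \\ 0 & 1 \end{smallmatrix}\right)$ and verifies that $T^{a} S T^{b} S T^{a} \equiv \mathrm{diag}(a, b)\, S \pmod N$ whenever $ab \equiv 1 \pmod N$. With the opposite sign convention for $S$ the identity changes by the central element $-1$.

```python
from math import gcd

S = [[0, -1], [1, 0]]   # assumed sign convention for S

def matmul(A, B, N):
    """Product of 2x2 integer matrices, reduced modulo N."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % N for j in range(2)]
            for i in range(2)]

def T_power(e, N):
    """T^e for T = [[1, 1], [0, 1]], reduced modulo N."""
    return [[1, e % N], [0, 1]]

def word(a, N):
    """T^a S T^b S T^a modulo N, where b is the inverse of a modulo N."""
    b = pow(a, -1, N)                      # modular inverse (Python 3.8+)
    M = T_power(a, N)
    for X in (S, T_power(b, N), S, T_power(a, N)):
        M = matmul(M, X, N)
    return M

def diagonal_times_S(a, N):
    """diag(a, a^-1) multiplied on the right by S, modulo N."""
    b = pow(a, -1, N)
    return matmul([[a, 0], [0, b]], S, N)

N = 12
print(all(word(a, N) == diagonal_times_S(a, N)
          for a in range(1, N) if gcd(a, N) == 1))
```

Because the check runs modulo every $N$ separately, it is really a statement about the inverse limit over $N$, which is where universal formulae of this kind have to live.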

Why do cyclotomic fields exhaust RCFT, hence demanding that the RCFT Galois 
action, unlike that on Grothendieck’s modular tower, be far from faithful? Is it trying to 
tell us something? What other principles can guide us to a Galois action on the remaining 
Moore-Seiberg data? 

Those questions lead us to Drinfel’d [161]. Recall from Section 1.6.2 that the pure braid
group $\mathcal{P}_n$ acts on each set $\mathrm{Hom}(A_1 \otimes \cdots \otimes A_n, V)$ in any braided monoidal category.
In particular, we can ask which subgroup of $\mathcal{P}_3 \times \mathcal{P}_2$ acts on the set of all braided
monoidal categories, where $\beta \in \mathcal{P}_3$ and $\gamma \in \mathcal{P}_2$ send the associativity constraint
$a : (A \otimes B) \otimes C \to A \otimes (B \otimes C)$ and the commutativity constraint $c : A \otimes B \to B \otimes A$,
respectively, of one such category to that of another. We require that $\beta.a$ and $\gamma.c$ satisfy
the various axioms, most importantly the pentagon and hexagon equations.

Dualising this, Drinfel’d suggested acting with $\mathcal{P}_3 \times \mathcal{P}_2$ on the data of quasi-triangular
quasi-Hopf algebras $A$ (defined in e.g. [98]). These algebras are co-commutative up to
conjugation by the $R$-matrix $R \in A \otimes A$ (as in Definition 6.2.8), and co-associative up to
conjugation by the associator $\Phi \in A \otimes A \otimes A$ ($\Phi$ measures how $A$ fails to be Hopf). $\Phi$
and $R$ are required to obey the triangle, pentagon and hexagon equations of Section 1.6.2.
We met quasi-triangular Hopf algebras in Definition 6.2.8; it will be clear shortly why
Drinfel’d prefers quasi-Hopf algebras. Identify $\mathcal{P}_2$ with $\mathbb{Z}$ and $\mathcal{P}_3$ with $F_2 \times \mathbb{Z}$ (1.1.10c);
then $m \in \mathcal{P}_2$ acts on the $R$-matrix by $m.R = R\,(R_{21}R)^m$ and, for example, a word
f(x, y) € Fy < P3 acts on the associator by f.® = f (Ra Ry2, PR32Rp30 |)! ®. The 
other quantities in the algebra $A$ are left unchanged. Unfortunately, this nice idea fails:
only the two elements $(\pm 1, 1) \in \mathcal{P}_2 \times \mathcal{P}_3$ satisfy the constraints and thus permute
quasi-triangular quasi-Hopf algebras (the nontrivial one sending $R$ to $R_{21}^{-1}$, and fixing everything
else).

Drinfel’d then proposed that there would be more solutions if we take profinite completions
(indeed, this is a raison d’être of completions), so in place of $\mathcal{P}_2 = \mathbb{Z}$ and
$\mathcal{P}_3 = \mathbb{Z} \times F_2$ we take $\widehat{\mathcal{P}}_2 = \widehat{\mathbb{Z}}$ and $\widehat{\mathcal{P}}_3 = \widehat{\mathbb{Z}} \times \widehat{F}_2$. To get these profinite actions on the
$R$ and $\Phi$, it suffices to take the scalars of the algebras $A$ to be formal power series
QILA] rather than C. The hope is that by completing the groups, there is more chance of 
nontrivial solutions to the triangle, pentagon and hexagon equations. The details would 
take us too far afield, but the result is that there are indeed several solutions. 

Drinfel’d was interested in this because, in an earlier paper, he had found, for each
choice of simple Lie algebra $\mathfrak{g}$, a universal formula for one solution $(\Phi, R)$ to those
equations, using Kohno’s monodromy theorem for the KZ connection. Unfortunately
this formula for $\Phi$ is quite complicated. In [161] he investigates two commuting actions
on the set of all solutions $(\Phi, R)$, which he uses to deduce the existence of a simpler
solution n (see [39]). One of these actions was this pure braid group action. 

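Let $\widehat{GT}$, the Grothendieck-Teichmüller group, be the group of all pairs $(\lambda, f) \in \widehat{\mathbb{Z}}^\times \times
\widehat{F}_2$ (the $\widehat{\mathbb{Z}}$ of $\widehat{\mathcal{P}}_3$ can’t contribute) satisfying those equations and thus permuting those
quasi-triangular quasi-Hopf algebras. $\widehat{GT}$ is large; in fact, as we will see, $\Gamma_{\mathbb{Q}}$ embeds as
a subgroup in it. Drinfel’d conjectured that $\widehat{GT}$ should act on the profinite completion
of Grothendieck’s tower. For example, on $\widehat{\mathcal{B}}_n$, topologically generated as we know by
$\sigma_1, \ldots, \sigma_{n-1}$, we get the action by $(\lambda, f) \in \widehat{GT}$ given by


$$(\lambda, f).\sigma_i = f(y_i, \sigma_i^2)^{-1}\, \sigma_i^{\lambda}\, f(y_i, \sigma_i^2), \qquad (6.3.5a)$$
$$(\lambda, f).Z = Z^{\lambda}, \qquad (6.3.5b)$$


where $Z = (\sigma_{n-1} \cdots \sigma_1)^n$ topologically generates the centre of $\widehat{\mathcal{B}}_n$ (just as it does that of
$\mathcal{B}_n$) and $y_i = \sigma_{i-1} \cdots \sigma_1^2 \cdots \sigma_{i-1}$. This element $y_i$ arises in presentations of the genus-0
mapping class groups $\Gamma_{0,n}$, or braid groups of the sphere [59]. The ‘profinite word’
$f(y_i, \sigma_i^2) \in \widehat{\mathcal{B}}_n$ means the value $\phi(f)$ of the homomorphism $\phi : \widehat{F}_2 \to \widehat{\mathcal{B}}_n$ defined by

$\phi(x) = y_i$ and $\phi(y) = \sigma_i^2$.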

Moreover, $\Gamma_{\mathbb{Q}}$ maps injectively into $\widehat{GT}$ and so can be identified with some subgroup
of $\widehat{GT}$. Conjecturally, $\Gamma_{\mathbb{Q}}$ equals $\widehat{GT}$. For example, $(-1, 1)$ corresponds to complex-conjugation.
See [305], [494], [493], [39] and section 16.4 of [98] for reviews of $\widehat{GT}$ and
its action on, for example, the modular tower; [128] speculates on its relation to RCFT.

This is brought one step closer to RCFT by Kassel-Turaev [339]. It is relatively
straightforward to extend Drinfel’d’s action to certain braided monoidal categories. In
[339] a ‘pro-unipotent completion’ $\widehat{\mathcal{R}}$ is defined for any ribbon category $\mathcal{R}$. $\widehat{\mathcal{R}}$ is itself
a ribbon category, with the same objects as $\mathcal{R}$, but with each $\mathrm{Hom}(A, B)$ replaced by
some projective limit of its linearisation over $\widehat{\mathbb{Q}} = \prod_p \mathbb{Q}_p$. For example, for the choice
$\mathcal{R} = \mathbf{ribbon}$, $\mathrm{Hom}(\emptyset, \emptyset)$ can be identified with the space of formal finite linear combinations
over $\widehat{\mathbb{Q}}$ of framed oriented links in $\mathbb{R}^3$. Drinfel’d’s work yields an action of $\Gamma_{\mathbb{Q}}$
on the collection of these ribbon categories.

This category $\mathbf{ribbon}$ obeys a universality property as in Theorem 1.6.2. Now, any
automorphism $\sigma \in \Gamma_{\mathbb{Q}}$ acts on the data of $\mathbf{ribbon}$ to produce a new ribbon category
$\mathbf{ribbon}'$. Its objects and $\mathrm{Hom}(\emptyset, \emptyset)$ are unchanged. By universality, there is a functor
from $\mathbf{ribbon}$ to $\mathbf{ribbon}'$, sending $\mathrm{Hom}(\emptyset, \emptyset)$ to itself. That is, we get an action of $\Gamma_{\mathbb{Q}}$
on the $\widehat{\mathbb{Q}}$-span of links: a (framed oriented) link $L$ is taken to some linear combination
(over $\widehat{\mathbb{Q}}$) of links. $\Gamma_{\mathbb{Q}}$ also acts on related spaces, such as $\widehat{\mathbb{Q}}$-valued Vassiliev invariants
[339].

For example, complex-conjugation sends a link $L$ to its mirror reflection (in general a
link is not isotopic to its mirror reflection; see footnote 6 in chapter 1). However, [339]
show that this $\Gamma_{\mathbb{Q}}$ action is trivial on the commutator $[\Gamma_{\mathbb{Q}}, \Gamma_{\mathbb{Q}}]$, and thus really is an action
of $\widehat{\mathbb{Z}}^\times$.

This action is clearly very similar to that of RCFT. As we know, RCFT attaches the
matrix $S$ to the Hopf link (Figure 6.9). Complex-conjugation ($\lambda = -1 \in \widehat{\mathbb{Z}}^\times$) sends the
Hopf link to its mirror image; the mirror image corresponds to $\overline{S}$, which is what (6.3.3b)
reduces to for $\lambda = -1$.


Problem Identify the relation between [339] and the action (6.3.3) in RCFT. Can this
be used somehow to identify the Galois action on arbitrary Moore-Seiberg data?


We conjecture these actions are identical or at least very close. After all, they both
factor through to $\widehat{\mathbb{Z}}^\times$ and agree with complex-conjugation applied to the Hopf link.
Theorem 4 of [40] should make it possible to compute the [339] action on the Hopf link
for any $\lambda \in \widehat{\mathbb{Z}}^\times$, thus allowing us to compare it directly to (6.3.3a). As we’ve learned,
there are topological underpinnings of chiral RCFT data (e.g. the modular categories of
[534], [32]) as well as full RCFT (see e.g. [211]); this seems the obvious way to attack
this problem.

At least as interesting as this Galois action on the Moore-Seiberg data is that we can
also hope that $\Gamma_{\mathbb{Q}}$ (or at least $\widehat{\mathbb{Z}}^\times$) will act on the spaces $\mathfrak{B}^{(g,n)}$ of chiral blocks, since they
do in genus one, i.e. on the characters, which are modular functions (recall Section 2.3.3).

The Galois action (6.3.3) of RCFT is not directly related to Grothendieck’s (6.3.5).
The RCFT action would seem to be intimately related to Congruence Property 6.1.7, so
more relevant to RCFT than $\widehat{SL_2(\mathbb{Z})}$ should be the much simpler $\varprojlim SL_2(\mathbb{Z})/\Gamma(N) =
SL_2(\widehat{\mathbb{Z}})$.

So far in this subsection we’ve only addressed CFT ‘in the bulk’. What if anything does
Galois do to, for example, D-branes? Indeed, an action persists in boundary RCFT, though
it is more complicated [235]. In particular, this Galois action will no longer be abelian:
the algebraic numbers involved belong to exponent-2 extensions of the cyclotomic field
$\mathbb{Q}^{\mathrm{ab}}$. This complication opens the door to much more interesting mathematics.

It will be interesting to see if the $\widehat{\mathbb{Z}}^\times$ action in [106] can be related to that of RCFT. We
are to think of RCFT as being to generic quantum field theory what semi-simple finite-dimensional
Lie algebras are to generic ones. In this spirit, this Galois action on RCFT,
and its relation to $\Gamma_{\mathbb{Q}}$ and Grothendieck’s Esquisse, can be regarded perhaps as a toy
model for the much more ambitious Cosmic Galois Group of [93], which conjecturally
underlies the multiple zeta values found by Kreimer and others in more physical quantum
field theories.

As a final remark, it is quite possible that the Galois actions explored in this subsection
are related to the Fermat remarks of last subsection (see in particular section II of
[304]). The Fermat curve $F_N = \{x^N + y^N = 1\}$ is an abelian cover of
$\mathbb{P}^1(\mathbb{C}) \setminus \{0, 1, \infty\}$; in turn, its abelian covers are controlled by torsion points on its
Jacobian $\mathrm{Jac}(F_N)$, and in [304] the action of $\Gamma_{\mathbb{Q}}$ on $\widehat{F}_2$ is studied via those torsion points,
with results somewhat reminiscent of Section 6.3.2.


Question 6.3.1. Use the fact that the S$ and T matrices of (6.1.8) define modular data to 
compute the sum J`», emim n, (Note: This is called a Gauss sum. A similar calculation 
yields a generalisation of Gauss sums for any modular data.) 


Question 6.3.2. Find all $\tau \in \mathbb{H}$ such that the torus $\mathbb{C}/(\mathbb{Z} + \mathbb{Z}\tau)$ is isogenous to $\mathbb{C}/(\mathbb{Z} +
\mathbb{Z}i)$.

Question 6.3.3. Prove that the elliptic curves $y^2 = x^3 + ax$ and $y^2 = x^3 + b$ both have
complex multiplication for any $a$, $b$.


Question 6.3.4. What is the profinite completion $\widehat{G}$ for finite groups $G$?


Question 6.3.5. (a) Define $\sigma_i.x_j = x_j$ if $j \ne i, i+1$, and $\sigma_i.x_i = x_{i+1}$ and $\sigma_i.x_{i+1} =
x_{i+1}^{-1} x_i x_{i+1}$. Verify that this is a well-defined action. (It turns out that this action is
faithful.)

(b) Verify that for any $\beta \in \mathcal{B}_n$, $\beta$ fixes $x_1 \cdots x_n$, and there is a permutation $\pi$ and words
$a_i \in F_n$ such that $\beta.x_i = a_i x_{\pi i} a_i^{-1}$. (It turns out that, conversely, any automorphism $\beta$
obeying those two conditions must come from this braid group action. This gives a way
to solve the word problem in $\mathcal{B}_n$.)


Question 6.3.6. Choosing $X$ to be a sphere with two punctures, describe the associated
$\Gamma_{\mathbb{Q}}$-action (6.3.2).


7 


Monstrous Moonshine 


Thomas Edison once said that to invent you need a good imagination and a pile of junk. 
Let’s see what some imagination can do. 

This book has been about Moonshine: a diverse collection of points-of-contact between 
algebra, number theory and mathematical physics, which nevertheless has a common 
theory. The most remarkable example of Moonshine is surely the association of Haupt- 
moduls with elements of the Monster M. It is to this we finally turn. 

The reader should reread the introductory chapter, which quickly sketches the basics 
of Monstrous Moonshine. In this chapter we explore this in more detail. The original 
article [111] is still very readable and contains a wealth of information not found in other 
sources. Other reviews are [107], [410], [73], [154], [412], [249], [75], [469], [78], [237] 
and the introductory chapter in [201], and each has its own emphasis. 


7.1 The Monstrous Moonshine Conjectures 


Recall from the introductory chapter the McKay equation 
196 884 = 196 883 + 1. (7.1.1) 


The number on the left is the first nontrivial coefficient of the j-function, and the num- 
bers on the right are the dimensions of the smallest irreducible representations of the 
Fischer—Griess Monster M. On the one side, we have a modular function; on the other, a 
sporadic finite simple group. Monstrous Moonshine explores this completely unexpected 
connection between finite groups and modular functions. 
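The numerology can be reproduced from scratch. The sketch below uses the standard formula $j = E_4^3/\Delta$, with $E_4 = 1 + 240\sum_{n \ge 1} \sigma_3(n) q^n$ and $\Delta = q\prod_{n \ge 1}(1 - q^n)^{24}$ (this formula is not restated in this section, so treat it as an assumption of the example), and compares the first coefficients with sums of the Monster dimensions quoted in this chapter. The helper names are ours.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

def mul(a, b, n):
    """Product of two power series, truncated to n terms."""
    out = [0] * n
    for i, x in enumerate(a[:n]):
        if x:
            for j, y in enumerate(b[:n - i]):
                out[i + j] += x * y
    return out

def inverse(p, n):
    """Inverse of a power series with constant term 1, truncated to n terms."""
    inv = [1] + [0] * (n - 1)
    for k in range(1, n):
        inv[k] = -sum(p[i] * inv[k - i] for i in range(1, k + 1))
    return inv

def j_coefficients(n):
    """Coefficients of q^-1, q^0, ..., q^(n-2) in j = E4^3 / Delta."""
    e4 = [1] + [240 * sigma3(k) for k in range(1, n)]
    e4_cubed = mul(mul(e4, e4, n), e4, n)
    eta24 = [1] + [0] * (n - 1)            # prod (1 - q^k)^24, so Delta = q * eta24
    for k in range(1, n):
        factor = [0] * n
        factor[0], factor[k] = 1, -1       # the series 1 - q^k
        for _ in range(24):
            eta24 = mul(eta24, factor, n)
    return mul(e4_cubed, inverse(eta24, n), n)

c = j_coefficients(5)
print(c)
print(c[2] == 1 + 196883, c[3] == 1 + 196883 + 21296876)
```

The constant term 744 plays no role in Moonshine; it is the coefficients of the positive powers of $q$ that decompose into dimensions of Monster representations.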

The world is full of coincidences, and it isn’t always clear how seriously they should
be regarded. For instance, at the heart of Monstrous Moonshine is a holomorphic $c = 24$
VOA; the conjectured number of holomorphic $c = 24$ VOAs [488] is 71, and this is the
largest prime dividing $\|\mathbb{M}\|$. There are 26 sporadics, 26 generators in a presentation of
the Bimonster discussed shortly, and 26 conjugacy classes in the largest Mathieu group
$M_{24}$. Are any of those numbers related to the 24 of Section 2.5.1, the $K$-group $\mathbb{Z}_{48}$ of the
integers or the number (24) of 24-dimensional even self-dual lattices?¹

Nor is physics immune to such thoughts. The great physicist Dirac noticed [140] that
the ratio of the electrostatic to gravitational force between the proton and electron in a
hydrogen atom is a number $N$ of order $10^{40}$. He computed that the ratio of the mass
of the universe to the mass of a proton is roughly $N^2$, and that the ratio of the age of
the universe to the time needed for light to travel across the classical radius of the
electron is again roughly $N$. One can add that $\sqrt{N}$ is roughly Avogadro’s number, so
gives a measure of the minimum number of molecules needed in a macroscopic object.
Dirac argued that the simple functional relation of these numbers indicates that they are
all somehow physically related.

¹ Perhaps this Mathieu group remark is related somehow to the fact that for subgroups $G$ of $SL_2(\mathbb{C})$, the
Euler number of a minimal resolution of the quotient singularity $\mathbb{C}^2/G$ equals the number of conjugacy
classes of $G$ [143], [471].

What distinguishes (7.1.1) from some of these other coincidences is that the more it 
was studied, the more the coincidences multiplied, and the more structure was revealed. 

A noble goal for mathematics is surely to find interesting and fundamentally new 
theorems. Both history and common-sense suggest that to this end it is most profitable to 
look simultaneously at both exceptional structures and generic structures, to understand 
the special features of the former in the context of the latter, and to be led in this way 
to a new generation of exceptional and generic structures. That is the spirit in which 
Monstrous Moonshine should be studied. 


7.1.1 The Monster revisited 


Recall the finite simple group classification discussed in Section 1.1.2. The sporadics are
summarised in Table 7.1 (its dates are only approximate and the list of investigators is
taken from [109]). The Monster $\mathbb{M}$ is the largest of these 26 sporadic groups. Its existence
was conjectured in 1973 by Fischer and Griess, and it was finally constructed (somewhat artificially)
in 1980 by Griess [263]. Tits [528] showed that $\mathbb{M}$ is the automorphism group of
a 196 883-dimensional commutative non-associative algebra also constructed by Griess
and now called the Griess algebra (Griess showed only that $\mathbb{M}$ was a subgroup of that
automorphism group). We now understand the Griess algebra as the first nontrivial tier
(0-mode algebra) of a VOA, the Moonshine module $V^\natural$, lying at the heart of Monstrous
Moonshine.

The Monster has 194 conjugacy classes, and so that number of irreducible represen- 
tations. Its character table (and other useful information) is given in the Atlas [109], 
where we also find analogous data for the other simple groups of ‘small’ order. Table 7.2 
gives the upper-left 0.25% or so of the character table of M. The name ‘4C’, for exam- 
ple, is given to the third smallest (hence ‘C’) conjugacy class of elements of order 4. 
Table 7.2 tells us that the dimensions of the smallest irreducible representations of M 
are 1, 196 883, 21 296 876 and 842 609 326. 

The centralisers $C_G(g)$ of conjugate elements are isomorphic (why?). The centralisers
for all classes of order up to 11 are given in table 2a of [111]. The first few are $C_{\mathbb{M}}(2A) =
2.\mathbb{B}$, $C_{\mathbb{M}}(2B) = 2^{1+24}.Co_1$, $C_{\mathbb{M}}(3A) = 3.Fi'_{24}$, $C_{\mathbb{M}}(3B) = 3^{1+12}.2.Suz$, $C_{\mathbb{M}}(3C) = 3 \times Th$.
We follow the notation of [109]: by, for example, ‘$2.\mathbb{B}$’ we mean a group with $\mathbb{Z}_2$ as a
normal subgroup and $\mathbb{B}$ as the quotient, or equivalently an extension of $\mathbb{B}$ by $\mathbb{Z}_2$. Of course
the centraliser $C_G(g)$ has $\langle g \rangle$ as a subgroup of its centre, hence $\langle g \rangle$ is normal in $C_G(g)$;
that is, for example, the ‘2’ in $2.\mathbb{B}$. Knowing the centraliser, the sizes of the conjugacy
classes can be quickly determined through the formula $\|K_g\| = \|\mathbb{M}\|/\|C_{\mathbb{M}}(g)\|$. These
centralisers play a large role in Section 7.3 below.

Table 7.1. The 26 sporadic groups

Group   Exact order                                                   Approx. order   Investigators
M11     2^4.3^2.5.11                                                  7.9 × 10^3      Mathieu (1861, 1873)
M12     2^6.3^3.5.11                                                  9.5 × 10^4      Mathieu (1861, 1873)
J1      2^3.3.5.7.11.19                                               1.8 × 10^5      Janko (1965)
M22     2^7.3^2.5.7.11                                                4.4 × 10^5      Mathieu (1861, 1873)
J2      2^7.3^3.5^2.7                                                 6.0 × 10^5      Hall, Janko (1960s)
M23     2^7.3^2.5.7.11.23                                             1.0 × 10^7      Mathieu (1861, 1873)
HS      2^9.3^2.5^3.7.11                                              4.4 × 10^7      Higman, Sims (1968)
J3      2^7.3^5.5.17.19                                               5.0 × 10^7      Janko, Higman, McKay (1960s)
M24     2^10.3^3.5.7.11.23                                            2.4 × 10^8      Mathieu (1861, 1873)
McL     2^7.3^6.5^3.7.11                                              9.0 × 10^8      McLaughlin (1969)
He      2^10.3^3.5^2.7^3.17                                           4.0 × 10^9      Held, Higman, McKay (1960s)
Ru      2^14.3^3.5^3.7.13.29                                          1.5 × 10^11     Rudvalis, Conway, Wales (1973)
Suz     2^13.3^7.5^2.7.11.13                                          4.5 × 10^11     Suzuki (1969)
O'N     2^9.3^4.5.7^3.11.19.31                                        4.6 × 10^11     O'Nan, Sims (1970s)
Co3     2^10.3^7.5^3.7.11.23                                          5.0 × 10^11     Conway (1968)
Co2     2^18.3^6.5^3.7.11.23                                          4.2 × 10^13     Conway (1968)
Fi22    2^17.3^9.5^2.7.11.13                                          6.5 × 10^13     Fischer (1970s)
HN      2^14.3^6.5^6.7.11.19                                          2.7 × 10^14     Harada, Norton, Smith (1975)
Ly      2^8.3^7.5^6.7.11.31.37.67                                     5.2 × 10^16     Lyons, Sims (1972)
Th      2^15.3^10.5^3.7^2.13.19.31                                    9.1 × 10^16     Thompson, Smith (1975)
Fi23    2^18.3^13.5^2.7.11.13.17.23                                   4.1 × 10^18     Fischer (1970s)
Co1     2^21.3^9.5^4.7^2.11.13.23                                     4.2 × 10^18     Conway, Leech (1968)
J4      2^21.3^3.5.7.11^3.23.29.31.37.43                              8.7 × 10^19     Janko, Norton, Parker, Benson, Conway, Thackray (1970s)
Fi24'   2^21.3^16.5^2.7^3.11.13.17.23.29                              1.3 × 10^24     Fischer (1970s)
B       2^41.3^13.5^6.7^2.11.13.17.19.23.31.47                        4.2 × 10^33     Fischer, Sims, Leon (1970s)
M       2^46.3^20.5^9.7^6.11^2.13^3.17.19.23.29.31.41.47.59.71        8.1 × 10^53     Fischer, Griess (1973, 1982)
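As a worked instance of the class-size formula, the sketch below computes the size of the class 2A from the orders of $\mathbb{M}$ and of the Baby Monster $\mathbb{B}$, using the centraliser $2.\mathbb{B}$ quoted above. The prime factorisations are the standard ones for these two groups; the helper names are ours.

```python
from math import prod

# Prime factorisations of the orders of the Monster and the Baby Monster.
MONSTER = {2: 46, 3: 20, 5: 9, 7: 6, 11: 2, 13: 3, 17: 1, 19: 1, 23: 1,
           29: 1, 31: 1, 41: 1, 47: 1, 59: 1, 71: 1}
BABY = {2: 41, 3: 13, 5: 6, 7: 2, 11: 1, 13: 1, 17: 1, 19: 1, 23: 1, 31: 1, 47: 1}

def order(factorisation):
    """Multiply out a dictionary {prime: exponent}."""
    return prod(p**e for p, e in factorisation.items())

def class_size(group_order, centraliser_order):
    """Size of a conjugacy class: the group order divided by the centraliser order."""
    quotient, remainder = divmod(group_order, centraliser_order)
    if remainder:
        raise ValueError("centraliser order must divide the group order")
    return quotient

# The centraliser of an element of class 2A is 2.B, of order 2 * |B|.
size_2A = class_size(order(MONSTER), 2 * order(BABY))
print(size_2A)
```

The answer has 20 digits, so although the centraliser $2.\mathbb{B}$ is enormous, the class 2A is still a vanishingly small fraction of the Monster.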

The Monster M has a remarkably simple presentation. As with any noncyclic finite 
simple group, it is generated by its involutions (i.e. elements of order 2) and so is a 
homomorphic image of a Coxeter group (Definition 3.2.1) — see Question 7.1.1. 

Let $\mathcal{G}_{pqr}$, $p \ge q \ge r \ge 2$, be the graph consisting of three strands of lengths
$p + 1$, $q + 1$, $r + 1$, sharing a common endpoint. Label the $p + q + r + 1$ nodes as
in Figure 7.1 (this labelling is not standard). Given any graph $\mathcal{G}_{pqr}$, define $Y_{pqr}$ to be the
group consisting of a generator for each node, obeying the usual Coxeter group relations,
together with an additional one (what Conway calls the ‘spider relation’):


$$(a b_1 b_2\, a c_1 c_2\, a d_1 d_2)^{10} = 1. \qquad (7.1.2)$$


The relation (7.1.2) arises naturally in a generalisation of the Coxeter group due 
to Conway, called a fabulous group. Conway conjectured and, building on work by 


Table 7.2. The north-west corner of the Monster character table


ch\K       1A               2A          2B       3A        3B      3C     4A       4B     4C    4D     5A      5B
$\rho_0$   1                1           1        1         1       1      1        1      1     1      1       1
$\rho_1$   196883           4371        275      782       53      -1     275      51     19    -13    133     8
$\rho_2$   21296876         91884       -2324    7889      -130    248    1772     -52    -20   12     626     1
$\rho_3$   842609326        1139374     12974    55912     -221    -248   8878     782    -82   78     2451    -49
$\rho_4$   18538750076      8507516     123004   249458    1598    248    28796    2652   380   156    6326    76
$\rho_5$   19360062527      9362495     -58305   297482    1508    -247   35903    -833   63    -65    8152    27
$\rho_6$   293553734298     53981850    98970    1055310   -3927   3876   94874    1274   -102  -454   17423   -77
$\rho_7$   3879214937598    337044990   -690690  4751823   -4173   -3876  345598   -3874  -258  286    54473   98
$\rho_8$   36173193327999   1354188159  2864511  12616074  18954   0      701823   20383  -897  351    91124   -126
$\rho_9$   125510727015275  3215883115  1219435  24688454  -25375  248    1223531  19499  -661  -1365  145275  -350

Fig. 7.1 The graph $\mathcal{G}_{555}$ presenting the Bimonster.


Ivanov [311], Norton proved [451] that $Y_{555} \cong Y_{444}$ is the Bimonster, the wreathed-square
$\mathbb{M} \wr \mathbb{Z}_2 = (\mathbb{M} \times \mathbb{M}).2$ of the Monster (in fact it is a semi-direct product
$(\mathbb{M} \times \mathbb{M}) \rtimes \mathbb{Z}_2$). We define the wreath product in Question 7.1.2; the wreathed-square
$\mathbb{M} \wr \mathbb{Z}_2$ has $G = \mathbb{M}$ and $H = S = \mathbb{Z}_2$, where H acts on S by group multiplication.
The group-theoretic significance of the wreath product is that any group G containing a
normal subgroup N with quotient $G/N \cong H$ can be identified with a subgroup of $N \wr H$
with S = H. Thus any extension $\mathbb{M}.2$ of $\mathbb{Z}_2$ by $\mathbb{M}$ is a subgroup of the Bimonster.
The Bimonster appears naturally in Section 7.3.9. A closely related presentation of the
Bimonster has 26 involutions as generators and has relations given by the incidence graph
of the projective plane of order 3; the Monster itself arises from 21 involutions and the
affine plane of order 3. See [112] for details.

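The groups $Y_{pqr}$, for $p \le 5$, have now all been identified; see [312] for a unified
treatment. The ones involving sporadic groups are

$Y_{553} \cong Y_{443} \cong \mathbb{M} \times \mathbb{Z}_2$,
$Y_{533} \cong Y_{433} \cong \mathbb{Z}_2 \times (2.B)$,
$Y_{552} \cong Y_{442} \cong 3.\mathrm{Fi}_{24}$,
$Y_{532} \cong Y_{432} \cong \mathbb{Z}_2 \times \mathrm{Fi}_{23}$,
$Y_{332} \cong \mathbb{Z}_2 \times (2.\mathrm{Fi}_{22})$.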


The Coxeter groups of the graphs $\mathcal{G}_{555}$, $\mathcal{G}_{553}$, $\mathcal{G}_{533}$, $\mathcal{G}_{552}$ and $\mathcal{G}_{532}$ are all infinite groups
of hyperbolic reflections in, for example, $\mathbb{R}^{17,1}$, and contain copies of groups such as the
affine $E_8$ Weyl group, so there should be rich geometry here.

What role, if any, these remarkable presentations have in Monstrous Moonshine hasn’t
been established yet. As a first step though, [424] has found in the automorphism group
of the Moonshine module $V^\natural$ the 21 involutions generating $\mathbb{M}$. Perhaps this can simplify
the hardest part of [201] (see Section 7.2.1 below). Indeed, Miyamoto’s simplified
construction [427] of $V^\natural$ and proof that $\mathrm{Aut}(V^\natural) \cong \mathbb{M}$ uses Ivanov’s characterisation [311]
of $\mathbb{M}$. There is a correspondence [425] between certain involutions of a VOA $V$ (e.g.
class 2A in $\mathbb{M}$ for $V^\natural$) and certain vertex operator subalgebras of $V$ isomorphic to the
unique c = 1/2 rational VOA (the Ising model of Section 4.3.2); this technical tool has
many applications, for example the association of various vertex operator superalgebras
to $V^\natural$, and the VOA interpretation of McKay’s $E_8^{(1)}$ observation in Section 7.3.6.

7.1.2 Conway and Norton’s fundamental conjecture


As mentioned in the introductory chapter, the central structure in the attempt to
understand equations (0.2.1) is an infinite-dimensional graded module for the Monster,
$V = V_0 \oplus V_1 \oplus V_2 \oplus \cdots$, with graded dimension $J(\tau) = j(\tau) - 744$ (see (0.3.2)). If
we let $\rho_d$ denote the dth smallest irreducible $\mathbb{M}$-module, numbered as in Table 7.2, then
the first few subspaces will be $V_0 = \rho_0$, $V_1 = \{0\}$, $V_2 = \rho_0 \oplus \rho_1$, $V_3 = \rho_0 \oplus \rho_1 \oplus \rho_2$ and
$V_4 = \rho_0 \oplus \rho_0 \oplus \rho_1 \oplus \rho_1 \oplus \rho_2 \oplus \rho_3$. As we know from Section 1.1.3, a dimension can
(and should) be twisted, by replacing it with the character. This gives us the graded traces

$$T_g(\tau) := \mathrm{ch}_{V_0}(g)\, q^{-1} + \sum_{n=1}^{\infty} \mathrm{ch}_{V_{n+1}}(g)\, q^n, \tag{7.1.3}$$

called the McKay–Thompson series for this module V. Of course, $T_e = J$.
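To make the twisting concrete, here is a minimal sketch that evaluates the first three coefficients of a McKay–Thompson series directly from character values. The character values are columns of Table 7.2 and the multiplicities are the decompositions of the first few subspaces quoted above; the dictionary layout and function name are our own choices.

```python
# Values of rho_0, rho_1, rho_2, rho_3 on a few conjugacy classes (Table 7.2).
CHARACTERS = {
    "1A": [1, 196883, 21296876, 842609326],
    "2A": [1, 4371, 91884, 1139374],
    "2B": [1, 275, -2324, 12974],
    "3A": [1, 782, 7889, 55912],
    "3C": [1, -1, 248, -248],
}

# Multiplicities of rho_0..rho_3 in the subspaces carrying the coefficients of
# q, q^2 and q^3: rho_0+rho_1, rho_0+rho_1+rho_2, 2rho_0+2rho_1+rho_2+rho_3.
DECOMPOSITIONS = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [2, 2, 1, 1],
]


def mckay_thompson_coefficients(conjugacy_class):
    """Return [a_1(g), a_2(g), a_3(g)] for g in the given class."""
    values = CHARACTERS[conjugacy_class]
    return [sum(m * v for m, v in zip(multiplicities, values))
            for multiplicities in DECOMPOSITIONS]


for name in CHARACTERS:
    print(name, mckay_thompson_coefficients(name))
```

For the identity class this simply recovers the first coefficients of J; for the other classes the same sums of character values give the twisted coefficients.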


Conjecture 7.1.1 (Conway–Norton [111]) There exists a graded $\mathbb{M}$-module V such
that, for each element g of the Monster $\mathbb{M}$, the McKay–Thompson series $T_g$ is the
Hauptmodul

$$J_{\Gamma_g}(\tau) = q^{-1} + \sum_{n=1}^{\infty} a_n(g)\, q^n \tag{7.1.4}$$

for a genus-0 group $\Gamma_g$ of Moonshine-type. These groups each contain $\Gamma_0(N)$ as a normal
subgroup, for some N dividing o(g) gcd(24, o(g)), and the quotient group $\Gamma_g / \Gamma_0(N)$ has
exponent $\le 2$.


So for each n the map $g \mapsto a_n(g)$ is a character $\mathrm{ch}_{V_{n+1}}(g)$ of $\mathbb{M}$. The quantity o(g) is the
order of g. We defined the groups of Moonshine-type in Definition 2.2.4 and $\Gamma_0(N)$
in (2.2.4b). By the exponent of a group we mean the smallest positive m such that
$h^m = 1$ for all h in the group. [111] explicitly identify each of the groups $\Gamma_g$. The first
50 coefficients $a_n(g)$ of each $T_g$ are given in [413]. Together with the recursions given in
Section 7.1.4 below, this allows one to effectively compute arbitrarily many coefficients
$a_n(g)$ of the Hauptmoduls. It is also this that uniquely defines V, up to equivalence, as a
graded $\mathbb{M}$-module.

There are around $8 \times 10^{53}$ elements in the Monster, so naively we may expect about
$8 \times 10^{53}$ different Hauptmoduls $T_g$. However, a character evaluated at g and at $hgh^{-1}$
will always be equal, so $T_g = T_{hgh^{-1}}$. Hence there can be at most 194 distinct $T_g$ (one for
each conjugacy class). All coefficients $a_n(g)$ are integers (as are in fact most entries of
the character table of $\mathbb{M}$). This implies that $T_g = T_h$ whenever the cyclic subgroups $\langle g \rangle$
and $\langle h \rangle$ are equal (why?). In fact, the total number of distinct McKay–Thompson series
$T_g$ arising in Monstrous Moonshine turns out to be only 171.

Of those many redundancies among the $T_g$, only one is unexpected (and unexplained):
the McKay–Thompson series of two unrelated classes of order 27, namely 27A and 27B,
are equal. It would be interesting to understand what general phenomenon (if any) is
responsible for $T_{27A}(\tau) = T_{27B}(\tau)$. But as we know from Section 5.3.3, the McKay–Thompson
series $T_g(\tau)$ are actually specialisations of 1-point functions and as such
are functions of not only $\tau$ but of all $\mathbb{M}$-invariant vectors v in $V^\natural$. What we call $T_g(\tau)$
is really the specialisation $T_g(\tau, 1)$ of this function $T_g(\tau, v)$. All 194 $T_g$ (one for each
conjugacy class) will be linearly independent, if we include this $v \in (V^\natural)^{\mathbb{M}}$ dependence.
Thus the equality $T_{27A}(\tau) = T_{27B}(\tau)$ should be regarded as an accidental redundancy
caused by specialisation, and is not of any deep significance. Plenty of Norton’s
series $N_{(g,h)}(\tau)$ (Section 7.3.2) will likewise be accidentally equal. Modular aspects of
the 1-point functions $T_g(\tau, v)$ are studied in [155].

Recall that there are two different conjugacy classes of order 2 elements: 2A and 2B.
Class 2B corresponds to $\Gamma_0(2)$ and gives the Hauptmodul $J_2$ in (2.2.17a), while class 2A
corresponds to $\Gamma_0(2)+$, where for any prime p we define

$$\Gamma_0(p)+ := \left\langle \Gamma_0(p),\ \frac{1}{\sqrt{p}} \begin{pmatrix} 0 & -1 \\ p & 0 \end{pmatrix} \right\rangle. \tag{7.1.5}$$

Similarly, (2.2.17b) corresponds to an order 13 element in $\mathbb{M}$, but $J_{25}$ in (2.2.17c)
doesn’t equal any $T_g$. Recall that there are exactly 616 Hauptmoduls of Moonshine-type
with integer coefficients [121], so most of these don’t arise as $T_g$. Recently [110], a
fairly simple characterisation has been found of the groups arising as $\Gamma_g$ in Monstrous
Moonshine:


Proposition 7.1.2 [110] A subgroup G of $SL_2(\mathbb{R})$ equals one of the modular groups
$\Gamma_g$ appearing in Conjecture 7.1.1, iff:
(i) G is genus 0;
(ii) G has the form ‘$\Gamma_0(n\|h) + e, f, g, \ldots$’;
(iii) the quotient of G by $\Gamma_0(nh)$ is a group of exponent $\le 2$; and
(iv) each cusp in $\mathbb{Q} \cup \{i\infty\}$ can be mapped to $i\infty$ by an element of $SL_2(\mathbb{R})$ that conjugates
the group to one containing $\Gamma_0(nh)$.


The notation in (ii) is a little too technical to explain here, but it is given in [111] or [110].
We now understand the significance, in the VOA or CFT framework, of transformations
in $SL_2(\mathbb{Z})$ (see especially Section 5.3.6), but (ii) emphasises that many modular
transformations relevant to Moonshine are more general (called Atkin–Lehner involutions).
Monstrous Moonshine will remain mysterious until we can understand its Atkin–Lehner
symmetries. This isn’t a hopeless task: for example, [433] provides an early attempt at
studying string theories with Atkin–Lehner symmetries, as well as their possible physical
significance. Some of these involutions appear naturally in Weil’s Converse Theorem
(see e.g. page 64 of [90]). Perhaps a topological interpretation for the groups $\Gamma_g$ not
contained in $SL_2(\mathbb{Z})$, in the spirit of Section 2.4.3, will help us understand their relevance
in VOAs and the meaning of Atkin–Lehner involutions to CFT. This proposition is the
answer to an important question, but unfortunately the proof of this characterisation in [110] is
by exhaustion, and so by itself doesn’t contribute anything conceptually.


7.1.3 $E_8$ and the Leech


There are other less important conjectures in [111]. We’ve already seen easy-to-understand
relations of $E_8$ and the Leech lattice $\Lambda$ to the J-function: (0.5.1) (explained
in Section 3.2.3) and (0.5.2) (explained in Question 2.2.7). There is another way $E_8$ and
$\Lambda$ can be related to modular functions.

Lattices are related to groups through their automorphism groups, which are always
finite for positive-definite lattices. The automorphism group $\mathrm{Aut}(\Lambda) = Co_0$ of the Leech
lattice has order about $8 \times 10^{18}$, and is a central extension by $\mathbb{Z}_2$ of Conway’s simple
group $Co_1$. Several other sporadic groups are also involved in $Co_0$, as we’ll see in
Section 7.3.1. To each automorphism $\alpha \in Co_0$, let $\theta_\alpha$ denote the theta function of the
sublattice of $\Lambda$ fixed by $\alpha$. Conway–Norton also associate with each automorphism $\alpha$
a certain function $\eta_\alpha(\tau)$ of the form $\prod_i \eta(a_i \tau) / \prod_j \eta(b_j \tau)$ built out of the Dedekind
eta function (2.2.6b). Both $\theta_\alpha$ and $\eta_\alpha$ are constant on each conjugacy class in $Co_0$, of
which there are 167. [111] remarks that the ratio $\theta_\alpha / \eta_\alpha$ always seems to equal some
McKay–Thompson series $T_{g(\alpha)}$.

It turns out that this observation isn’t quite correct [366]. For each automorphism
$\alpha \in Co_0$, the subgroup of $SL_2(\mathbb{R})$ that fixes $\theta_\alpha / \eta_\alpha$ is indeed always genus 0, but for
exactly 15 conjugacy classes in $Co_0$, $\theta_\alpha / \eta_\alpha$ is not the Hauptmodul. Nevertheless, this
construction proved useful for establishing Moonshine for $M_{24}$ [407].

Similarly, one can ask this for the $E_8$ root lattice, whose automorphism group is the
Weyl group of the Lie algebra $E_8$ (of order 696 729 600). The automorphisms of the
lattice $E_8$ that yield a Hauptmodul were classified in [95]. On the other hand, Koike
established a Moonshine of this kind for the groups $PSL_2(\mathbb{F}_7)$, $PSL_2(\mathbb{F}_5) \cong A_5$ and
$PSL_2(\mathbb{F}_3)$, of order 168, 60 and 12, respectively [356].


7.1.4 Replicable functions 


A conjecture in [111] that played an important role in ultimately proving the main
conjecture involves the replication formulae. Conway–Norton want to think of the Hauptmoduls
$T_g$ as being intimately connected with $\mathbb{M}$; if so, then the group structure of $\mathbb{M}$ should
somehow directly relate different $T_g$. Considering the power map $g \mapsto g^n$ leads to the
following.

It was well known classically that $J(\tau)$ (equivalently, $j(\tau)$) has the property that

$$s(\tau) := J(p\tau) + J\left(\frac{\tau}{p}\right) + J\left(\frac{\tau+1}{p}\right) + \cdots + J\left(\frac{\tau+p-1}{p}\right) \tag{7.1.6a}$$

is a polynomial in $J(\tau)$, for any prime p. The proof is straightforward, and is based on
the principle that the easiest way to construct a function invariant with respect to some
group G is by averaging it over the group: $\sum_{g \in G} f(g.x)$. Here f(x) is $J(p\tau)$ and G is
$SL_2(\mathbb{Z})$, and we’ll average over finitely many cosets rather than infinitely many elements.
First, writing $\Gamma$ for $SL_2(\mathbb{Z})$, note that

$$\Gamma \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix} \Gamma = \Gamma \begin{pmatrix} p & 0 \\ 0 & 1 \end{pmatrix} \cup \bigcup_{i=0}^{p-1} \Gamma \begin{pmatrix} 1 & i \\ 0 & p \end{pmatrix} = \{ A \in M_{2 \times 2}(\mathbb{Z}) \mid \det(A) = p \}. \tag{7.1.6b}$$

In Question 7.1.4 you show that this implies (7.1.6a) is a modular function for $SL_2(\mathbb{Z})$.
Hence $s(\tau)$ equals a rational function $Q(J(\tau))/P(J(\tau))$ of $J(\tau)$, as in (0.1.7). Because
the only poles of J are at the cusps, the same applies to $s(\tau)$. This implies that the
denominator polynomial P(z) must be trivial (recall that $J(\mathbb{H}) = \mathbb{C}$). QED

The map $J(\tau) \mapsto s(\tau)$ in (7.1.6a) is called a ‘Hecke operator’, and is an important
ingredient of modular theory. More generally, the same argument says

$$\sum_{ad=n,\ 0 \le b < d} J\left(\frac{a\tau + b}{d}\right) = Q_n(J(\tau)), \tag{7.1.7}$$

where $Q_n$ is the unique polynomial for which $Q_n(J(\tau)) - q^{-n}$ has a q-expansion with
only strictly positive powers of q. For example, $Q_2(x) = x^2 - 2a_1$ and $Q_3(x) = x^3 -
3a_1 x - 3a_2$, where we write $J(\tau) = \sum_n a_n q^n$. These equations (7.1.7) can be rewritten
into recursions such as $a_4 = a_3 + (a_1^2 - a_1)/2$, or collected together into the remarkable
expression (3.4.7a).
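The recursion can be checked numerically. The sketch below computes the coefficients of J from the classical formula $j = E_4^3/\Delta$ (a standard fact brought in from outside this section, as are the function names), then compares the $q^m$ coefficients of both sides of (7.1.7) for n = 2: on the left $J(2\tau) + J(\tau/2) + J((\tau+1)/2)$ contributes $2a_{2m}$, plus $a_{m/2}$ when m is even, and on the right the $q^m$ coefficient of $J^2 - 2a_1$.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def series_mul(a, b, size):
    """Product of two power series (coefficient lists), truncated to `size` terms."""
    out = [0] * size
    for i, x in enumerate(a[:size]):
        if x:
            for j, y in enumerate(b[:size - i]):
                out[i + j] += x * y
    return out


def j_function_coefficients(nmax):
    """Return {n: a_n} for J = q^-1 + sum a_n q^n, with -1 <= n <= nmax."""
    size = nmax + 2
    e4 = [1] + [240 * sigma3(k) for k in range(1, size)]
    numerator = series_mul(series_mul(e4, e4, size), e4, size)
    # Multiply by 1/(1 - q^k) twenty-four times for each k: this is q / Delta.
    inverse = [1] + [0] * (size - 1)
    for k in range(1, size):
        for _ in range(24):
            for i in range(k, size):
                inverse[i] += inverse[i - k]
    j = series_mul(numerator, inverse, size)  # j[i] multiplies q^(i-1)
    a = {i - 1: c for i, c in enumerate(j)}
    a[0] -= 744  # J = j - 744
    return a


def hecke_2_holds(a, m):
    """Compare the q^m coefficients (m >= 1) of both sides of (7.1.7) for n = 2."""
    lhs = 2 * a[2 * m] + (a[m // 2] if m % 2 == 0 else 0)
    rhs = sum(a[i] * a[m - i] for i in range(-1, m + 2))  # q^m coefficient of J^2
    return lhs == rhs


a = j_function_coefficients(8)
print([a[n] for n in range(1, 6)])
print(a[4] == a[3] + (a[1] ** 2 - a[1]) // 2)
print(all(hecke_2_holds(a, m) for m in range(1, 5)))
```

The case m = 2 of the comparison is exactly the recursion for $a_4$ quoted above; exact integer arithmetic is used throughout, so there is no rounding to worry about.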

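Conway and Norton conjectured that these formulae have an analogue for any McKay–Thompson
series $T_g$. In particular, (7.1.7) becomes

$$\sum_{ad=n,\ 0 \le b < d} T_{g^a}\left(\frac{a\tau + b}{d}\right) = Q_{n,g}(T_g(\tau)), \tag{7.1.8a}$$

where $Q_{n,g}$ plays the same role for $T_g$ that $Q_n$ plays for J. For example, we get

$$T_{g^2}(2\tau) + T_g\left(\frac{\tau}{2}\right) + T_g\left(\frac{\tau+1}{2}\right) = T_g(\tau)^2 - 2a_1(g),$$

$$T_{g^3}(3\tau) + T_g\left(\frac{\tau}{3}\right) + T_g\left(\frac{\tau+1}{3}\right) + T_g\left(\frac{\tau+2}{3}\right) = T_g(\tau)^3 - 3a_1(g)\,T_g(\tau) - 3a_2(g).$$

These are called the replication formulae. Again, these yield recursions like $a_4(g) =
a_3(g) + (a_1(g)^2 - a_1(g^2))/2$, or can be collected into the expression

$$p^{-1} \exp\left[ -\sum_{k>0} \sum_{m>0,\ n \in \mathbb{Z}} a_{mn}(g^k)\, \frac{p^{mk} q^{nk}}{k} \right] = T_g(z) - T_g(\tau), \tag{7.1.8b}$$

where $p = e^{2\pi i z}$ and $q = e^{2\pi i \tau}$.
This looks a lot more complicated than (3.4.7a), but you can glimpse the Taylor expansion
of $\log(1 - p^m q^n)$ there and in fact for g = e, (7.1.8b) reduces to (3.4.7a).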
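The replication recursion $a_4(g) = a_3(g) + (a_1(g)^2 - a_1(g^2))/2$ can be tested on the two classes of involutions, where $g^2 = e$. The sketch below builds the needed coefficients from the character values in Table 7.2 and the multiplicities in Table 7.3; the names and data layout are illustrative only.

```python
# Values of rho_0, rho_1, rho_2, rho_3, rho_5 on three classes (Table 7.2).
CHARACTERS = {
    "1A": {0: 1, 1: 196883, 2: 21296876, 3: 842609326, 5: 19360062527},
    "2A": {0: 1, 1: 4371, 2: 91884, 3: 1139374, 5: 9362495},
    "2B": {0: 1, 1: 275, 2: -2324, 3: 12974, 5: -58305},
}

# Multiplicities of the rho_d in the spaces carrying a_1, a_3 and a_4 (Table 7.3).
SPACES = {
    1: {0: 1, 1: 1},
    3: {0: 2, 1: 2, 2: 1, 3: 1},
    4: {0: 2, 1: 3, 2: 2, 3: 1, 5: 1},
}

# Conjugacy class of g^2: involutions (and the identity) square to the identity.
SQUARE_CLASS = {"1A": "1A", "2A": "1A", "2B": "1A"}


def coefficient(n, conjugacy_class):
    """a_n(g) as a sum of character values, for n in SPACES."""
    values = CHARACTERS[conjugacy_class]
    return sum(m * values[d] for d, m in SPACES[n].items())


def replication_holds(conjugacy_class):
    """Check 2 (a_4(g) - a_3(g)) == a_1(g)^2 - a_1(g^2)."""
    lhs = 2 * (coefficient(4, conjugacy_class) - coefficient(3, conjugacy_class))
    rhs = (coefficient(1, conjugacy_class) ** 2
           - coefficient(1, SQUARE_CLASS[conjugacy_class]))
    return lhs == rhs


for name in CHARACTERS:
    print(name, coefficient(4, name), replication_holds(name))
```

Note that the term $a_1(g^2)$ is evaluated on the identity class here, which is precisely how the power map enters the recursion.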

Axiomatising (7.1.8a) leads to Conway and Norton’s notion of replicable function 
[449], [6]. 


Definition 7.1.3 Let f be any function of the form $f(\tau) = q^{-1} + \sum_{n=1}^{\infty} b_n q^n$, and
write $f^{(1)} = f$ and $b_n^{(1)} = b_n$. Let $Q_{n,f}$ be the unique (degree n) polynomial such that
the q-expansion of $Q_{n,f}(f(\tau)) - q^{-n}$ has only positive powers of q. Use

$$\sum_{ad=n,\ 0 \le b < d} f^{(a)}\left(\frac{a\tau + b}{d}\right) = Q_{n,f}(f(\tau)) \tag{7.1.9}$$

to recursively define each $f^{(n)}$. If each $f^{(n)}$ has a q-expansion of the form $f^{(n)}(\tau) =
q^{-1} + \sum_{k=1}^{\infty} b_k^{(n)} q^k$ (that is, no fractional powers of q arise), then we call f
replicable.

Proposition 7.1.4 [6] Suppose f is of the form $f(\tau) = q^{-1} + \sum_{n=1}^{\infty} a_n q^n$, and define
$Q_{n,f}$ as in Definition 7.1.3. Define $H_{m,n}$ by

$$Q_{n,f}(f(\tau)) = q^{-n} + n \sum_{m=1}^{\infty} H_{m,n}\, q^m.$$

Then f is replicable iff $H_{m,n} = H_{r,s}$ holds whenever mn = rs and gcd(n, m) = gcd(r, s).
The proof isn’t hard: if f is replicable, with replicates $f^{(k)} = q^{-1} + \sum_{n=1}^{\infty} a_n^{(k)} q^n$, then

$$H_{m,n} = \sum_{d \mid \gcd(n,m)} \frac{1}{d}\, a^{(d)}_{mn/d^2}$$

and the $H_{m,n} = H_{r,s}$ property is manifest. See Question 7.1.5 for the converse.
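The quantities $H_{m,n}$ are easy to compute from a truncated q-expansion. The following sketch (function names are ours) builds $Q_{n,f}(f)$ by starting from $f^n$ and subtracting multiples of lower powers of f until only $q^{-n}$ and positive powers remain, then reads off $H_{m,n}$. It is tried on J, using its first seven coefficients, and on the function $q^{-1} + q$; with the series known only through $q^7$, the values are trusted only for m + n at most 7.

```python
from fractions import Fraction

MAX_EXP = 7  # all series are truncated after q^7

# Laurent series as {exponent: coefficient}.
J = {-1: 1, 1: 196884, 2: 21493760, 3: 864299970, 4: 20245856256,
     5: 333202640600, 6: 4252023300096, 7: 44656994071935}
FICTION = {-1: 1, 1: 1}  # q^-1 + q


def laurent_mul(a, b):
    """Product of two Laurent series, dropping exponents above MAX_EXP."""
    out = {}
    for i, x in a.items():
        for j, y in b.items():
            if i + j <= MAX_EXP:
                out[i + j] = out.get(i + j, 0) + x * y
    return out


def faber_image(f, n):
    """q-expansion of Q_{n,f}(f): q^-n plus strictly positive powers of q."""
    powers = [{0: 1}]
    for _ in range(n):
        powers.append(laurent_mul(powers[-1], f))
    result = dict(powers[n])
    # Cancel the coefficients of q^-(n-1), ..., q^-1, q^0 using f^k = q^-k + ...
    for k in range(n - 1, -1, -1):
        c = result.get(-k, 0)
        for exponent, x in powers[k].items():
            result[exponent] = result.get(exponent, 0) - c * x
    return result


def H(f, m, n):
    """H_{m,n} of Proposition 7.1.4, for m + n <= MAX_EXP."""
    if m + n > MAX_EXP:
        raise ValueError("not enough coefficients for this (m, n)")
    return Fraction(faber_image(f, n).get(m, 0), n)


print(H(J, 2, 3), H(J, 3, 2), H(J, 1, 6), H(J, 6, 1))
print(H(J, 2, 2), H(FICTION, 2, 2), H(FICTION, 1, 3))
```

The four pairs (2,3), (3,2), (1,6), (6,1) all have mn = 6 and gcd 1, so the proposition predicts equal values; the pair (2,2) has a different gcd from (1,4) and is not constrained to agree with it.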


Equation (7.1.8a) conjectures that the McKay–Thompson series are replicable. In
particular, we have $(T_g)^{(n)}(\tau) = T_{g^n}(\tau)$. [123] proved that the Hauptmodul of any genus-0
modular group of Moonshine-type is replicable, provided its coefficients are rational.
Incidentally, if the coefficients $b_k^{(n)}$ are irrational, then Definition 7.1.3 should be modified
to include Galois automorphisms (see section 8 of [114]). Replication in positive genus
is discussed in [510].

Conversely, Norton has conjectured: 


Conjecture 7.1.5 Any replicable function with rational coefficients is either a Hauptmodul
for a genus-0 modular group of Moonshine-type, or is one of the ‘modular
fictions’ $f(\tau) = q^{-1} = \exp[-2\pi i \tau]$, $f(\tau) = q^{-1} + q = 2\cos[2\pi\tau]$, $f(\tau) = q^{-1} -
q = -2i \sin[2\pi\tau]$.


This conjecture seems difficult and is still open. 

As is manifest in (7.1.8a), replication concerns the power map $g \mapsto g^n$ in $\mathbb{M}$. Can
Moonshine see more of the group structure of $\mathbb{M}$? One step in this direction is explored
in Section 7.3.6, where McKay models products of conjugacy classes using Coxeter–Dynkin
diagrams. A different idea is given in Section 7.3.2. It would be very desirable
to find other direct connections between the group operation in $\mathbb{M}$ and, for example, the
McKay–Thompson series.


Question 7.1.1. Let G be a finite simple group, and let $K \ne \{e\}$ be any nontrivial
conjugacy class. Prove that K generates G. Why is any noncyclic finite simple group a
homomorphic image of a (possibly infinite) Coxeter group?


Question 7.1.2. Let G, H be any groups, and S any finite set on which H acts. By the
wreath product $G \wr H$ we mean the set of all pairs (f, h), where f is any function from
$S \to G$ and $h \in H$. Group multiplication is given by $(f, h)(f', h') = (f'', hh')$, where
$f'': S \to G$ is defined by $f''(s) = f(s)\, f'(h^{-1}.s)$.

(a) Verify that $G \wr H$ is a group. Compute its order.

(b) Find a normal subgroup in $G \wr H$, isomorphic to $G \times \cdots \times G$ ($\|S\|$ times). Identify
the quotient of $G \wr H$ by this normal subgroup.

(c) Find a subgroup of $G \wr H$ isomorphic to H.


Question 7.1.3. Note that the dimensions 196 883 and 21 296 876 (see (0.2.1)) exactly
divide the order of the Monster (see (0.2.2)). Is this (i) merely a coincidence; (ii) a
mysterious property of $\mathbb{M}$ perhaps relevant to Moonshine; or (iii) does it have a more
mundane explanation?


Question 7.1.4. Prove (7.1.6b). Use that to prove that the sum $s(\tau)$ in (7.1.6a) is invariant
under $SL_2(\mathbb{Z})$.


Question 7.1.5. Complete the proof of Proposition 7.1.4.


Question 7.1.6. Suppose $f(\tau) = q^{-1} + \sum_{k=1}^{n} a_k q^k$ is a replicable Laurent polynomial.
Prove that f is a modular fiction: $f(\tau) = q^{-1}$ or $f(\tau) = q^{-1} \pm q$.


Question 7.1.7. As we know from Section 3.2.3, $j^{1/3}$ is the graded dimension of the $E_8^{(1)}$-module
$L(\omega_0)$. Thus j is the graded dimension of $L(\omega_0) \otimes L(\omega_0) \otimes L(\omega_0)$, on which the
Lie group $(E_8(\mathbb{C}) \times E_8(\mathbb{C}) \times E_8(\mathbb{C})) \rtimes S_3$ acts. Explain why $L(\omega_0) \otimes L(\omega_0) \otimes L(\omega_0)$
cannot be the $\mathbb{M}$-module V whose graded characters (7.1.3) are the McKay–Thompson
series (ignoring the irrelevant constant 744).


7.2 Proof of the Monstrous Moonshine conjectures 


At first glance, any deep significance to the Moonshine conjectures seems very unlikely: 
they constitute after all a finite set of very specialised coincidences. The whole point 
though is to try to understand why such seemingly incomparable objects as the Monster 
and the Hauptmoduls can be so related, and to try to extend and apply this understand- 
ing to other contexts. Establishing the truth (or falsity) of the conjectures was merely 
meant as an aid to uncovering the meaning of Monstrous Moonshine. Indeed, in proving 
them, important new algebraic structures were formulated. We sketch this proof in this 
section. 

The main Conway–Norton conjecture was attacked almost immediately. Thompson
showed [524] (see also [476]) that if $g \mapsto a_n(g)$ is a character for all sufficiently small n
(apparently $n \le 1300$ is sufficient), then it will be for all n. He also showed that if certain
congruence conditions hold for a certain number of $a_n(g)$ (all with $n \le 100$), then all
$g \mapsto a_n(g)$ will be virtual characters (i.e. differences of true characters of $\mathbb{M}$). Atkin,
Fong and Smith (see [511] for details) used that and a computer to prove that indeed
all $a_n(g)$ were virtual characters (they didn’t quite get to n = 1300 though). But their
work doesn’t say anything more about the underlying (possibly virtual) representation
V, other than its existence, and so adds no light to Moonshine. It plays no role in the
following.

We want to prove Conjecture 7.1.1, that is, show that the McKay–Thompson series
$T_g(\tau)$ of (7.1.3) equals the Hauptmodul $J_{\Gamma_g}(\tau)$ in (7.1.4). First, we need to construct
the infinite-dimensional module V of $\mathbb{M}$. This we discuss in Section 7.2.1. Borcherds’
strategy was to bring in Lie theory, by associating with the module V a ‘Monster Lie
algebra’. This example of a Borcherds–Kac–Moody algebra is described in Section 7.2.2.
Next, we go from the Monster Lie algebra to the replication formula, and conclude the
proof. In the final subsection, we explain the need for a second proof, and suggest what
it may involve.

Table 7.3. The first few homogeneous spaces of the Moonshine module $V^\natural$

Space            $\mathbb{M}$-module
$V^\natural_0$   $\rho_0$
$V^\natural_1$   0
$V^\natural_2$   $\rho_0 \oplus \rho_1$
$V^\natural_3$   $\rho_0 \oplus \rho_1 \oplus \rho_2$
$V^\natural_4$   $2\rho_0 \oplus 2\rho_1 \oplus \rho_2 \oplus \rho_3$
$V^\natural_5$   $2\rho_0 \oplus 3\rho_1 \oplus 2\rho_2 \oplus \rho_3 \oplus \rho_5$
$V^\natural_6$   $4\rho_0 \oplus 5\rho_1 \oplus 3\rho_2 \oplus 2\rho_3 \oplus \rho_4 \oplus \rho_5 \oplus \rho_6$
$V^\natural_7$   $4\rho_0 \oplus 7\rho_1 \oplus 5\rho_2 \oplus 3\rho_3 \oplus \rho_4 \oplus 3\rho_5 \oplus \rho_6 \oplus \rho_7$
$V^\natural_8$   $7\rho_0 \oplus 11\rho_1 \oplus 7\rho_2 \oplus 6\rho_3 \oplus 3\rho_4 \oplus 4\rho_5 \oplus 2\rho_6 \oplus 2\rho_7 \oplus \rho_8$

Thanks largely to Borcherds, the Monstrous Moonshine conjectures opened a door to 
mathematical riches far beyond what Conway and Norton could have originally hoped. 
For his work in Monstrous Moonshine and related topics, Richard Borcherds was awarded 
the Fields medal in 1998. 


7.2.1 The Moonshine module $V^\natural$


The first essential step in the proof of the Monstrous Moonshine conjectures was the
construction by Frenkel–Lepowsky–Meurman [200] of a graded infinite-dimensional
representation $V^\natural$ of $\mathbb{M}$. They conjectured (correctly) that it is the representation V in
(0.3.1). As we know, $V^\natural$ has a very rich algebraic structure: it is in fact a VOA. A somewhat
simpler construction of $V^\natural$ is now available [427]; in particular, the fundamental fact
that $\mathrm{Aut}(V^\natural) = \mathbb{M}$ seems much clearer.

Each homogeneous space $V^\natural_n$ of $V^\natural$ is a finite-dimensional $\mathbb{M}$-module (see Table 7.3).
Being a finite group, $\mathbb{M}$ only has finitely many (in fact exactly 194) irreducible
representations, whereas $J(\tau)$ has infinitely many coefficients $a_n$, which grow rapidly with
n (roughly like $e^{4\pi\sqrt{n}}$). As can already be observed in the table, the decompositions of $V^\natural_n$ into irreducible
$\mathbb{M}$-modules become increasingly complicated, with ever-increasing multiplicities. Thus
the fact that 196 884 almost equals 196 883 is of no special significance, other than that
it made it easier to anticipate that j and $\mathbb{M}$ are related.
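The decompositions in Table 7.3 can be checked against the coefficients of J by adding up dimensions. In the sketch below the degrees of $\rho_0, \ldots, \rho_8$ are the 1A column of Table 7.2, the multiplicities are the rows of Table 7.3, and the coefficients $a_1, \ldots, a_7$ of J are the standard values; the names are illustrative.

```python
# Degrees of rho_0 .. rho_8 (column 1A of Table 7.2).
DEGREES = [1, 196883, 21296876, 842609326, 18538750076, 19360062527,
           293553734298, 3879214937598, 36173193327999]

# Multiplicities of rho_0, rho_1, ... in the homogeneous spaces of Table 7.3,
# indexed by the grade n of the space.
MULTIPLICITIES = {
    2: [1, 1],
    3: [1, 1, 1],
    4: [2, 2, 1, 1],
    5: [2, 3, 2, 1, 0, 1],
    6: [4, 5, 3, 2, 1, 1, 1],
    7: [4, 7, 5, 3, 1, 3, 1, 1],
    8: [7, 11, 7, 6, 3, 4, 2, 2, 1],
}

# Coefficients a_1 .. a_7 of J = q^-1 + sum a_n q^n.
J_COEFFICIENTS = {1: 196884, 2: 21493760, 3: 864299970, 4: 20245856256,
                  5: 333202640600, 6: 4252023300096, 7: 44656994071935}


def dimension(n):
    """Dimension of the grade-n space, summed over its irreducible constituents."""
    return sum(m * d for m, d in zip(MULTIPLICITIES[n], DEGREES))


for n in sorted(MULTIPLICITIES):
    # The grade-n space carries the coefficient of q^(n-1) in J.
    print(n, dimension(n), dimension(n) == J_COEFFICIENTS[n - 1])
```

The shift by one between the grade n and the power of q reflects the overall factor $q^{-1}$ in the graded dimension.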

Now, $V^\natural$ was constructed before VOAs had been defined. It was natural for
Frenkel–Lepowsky–Meurman to use vertex operators to try to construct the $\mathbb{M}$-module V of
(0.3.1), because there were already vertex operator constructions associated with
lattices, affine algebra modules and string theory, and all of these have connections to
modular functions. Borcherds’ definition [68] of vertex algebras abstracted out
algebraic properties of $V^\natural$ as well as those older vertex operator constructions.

As we discuss in Sections 4.3.4 and 5.3.6, the Moonshine module $V^\natural$ was constructed
as the orbifold of the Leech lattice VOA $V(\Lambda)$ by the $\pm 1$-symmetry of $\Lambda$; more precisely,
by an involution in $\mathrm{Aut}(V(\Lambda))$ restricting to the automorphism $-1$ of $\Lambda$. This orbifold
construction implies that $V^\natural$ is the direct sum of an invariant part $V^\natural_+ := V(\Lambda)^+$ and a
twisted part $V^\natural_- := V(\Lambda)_T^+$ (recall (4.3.16)). The underlying vector spaces can be (and
usually are) chosen to be real, and in fact later we speculate that they can be taken to be
$\mathbb{Z}$-modules (Conjecture 7.3.3).

The orbifold serves two purposes. First, it removes the constant term ‘24’ from the
graded dimension J + 24 of $V(\Lambda)$. This means that the Lie algebra $V^\natural_1$ vanishes, giving
$V^\natural$ a chance to have a finite automorphism group (Section 5.2.1). Second, this orbifold
construction enhances the symmetry from the discrete part of $\mathrm{Aut}(V(\Lambda))$, which is an
extension of $Co_0$ by $(\mathbb{Z}_2)^{24}$, to all of $\mathbb{M}$. In particular, that discrete part of $\mathrm{Aut}(V(\Lambda))$
preserves the decomposition $V^\natural = V^\natural_+ \oplus V^\natural_-$ and is isomorphic to the centraliser $C_{\mathbb{M}}(2B)$.
An additional automorphism of $V^\natural$, an involution $\sigma$ mixing $V^\natural_\pm$ and related to ‘triality’,
was constructed by hand. A theorem of Griess [263] shows that together they generate
$\mathbb{M}$. See [201] for more details. Establishing this symmetry enhancement is the most
difficult part of [201].

A major claim of [201] is that $V^\natural$ is a ‘natural’ structure (hence their notation). This
has been uncontested. We have $V^\natural_0 = \mathbb{C}\mathbf{1}$, as usual, and $V^\natural_1 = 0$. Hence the space $V^\natural_2$ will
be a commutative non-associative algebra with product $u \times v := u_1 v$ and identity $\frac{1}{2}\omega$
(Question 5.2.3). In fact, it is the 196 883-dimensional Griess algebra [263] extended
by an identity element, which is known to have automorphism group exactly $\mathbb{M}$ [528].
Using this, the automorphism group of $V^\natural$ can be seen to equal the Monster $\mathbb{M}$. The only
irreducible module for $V^\natural$ is itself; such a VOA is called holomorphic (Section 5.3.1).
Together with Zhu’s Theorem 5.3.8, this implies that its graded dimension must be a
modular function for $SL_2(\mathbb{Z})$, and in fact $j(\tau) - 744$ (Question 5.3.4).

All arguments relating $V^\natural$ to $\mathbb{M}$ are complicated by the bipartite structure $V^\natural_+ \oplus
V^\natural_-$. In particular, not all elements of $\mathrm{Aut}(V^\natural)$ are equally accessible. For example, [201]
could prove Conjecture 7.1.1 when $g \in \mathbb{M}$ preserves $V^\natural_+$ (equivalently, for any $g \in \mathbb{M}$
commuting with some element in class 2B) but not for the other $g \in \mathbb{M}$. Perhaps the
work of [424] will make the Monster’s action on $V^\natural$ more uniformly accessible.

Conjecturally, there are 71 holomorphic VOAs with central charge c = 24 [488].
Recall that the Leech lattice $\Lambda$ is the unique even self-dual positive-definite lattice of
dimension 24 containing no norm-squared 2-vectors [113]. Under the lattice $\leftrightarrow$ VOA
correspondence mentioned at the end of Section 5.2.2, we are led to the following:


Conjecture 7.2.1 [201] The Moonshine module $V^\natural$ is the unique holomorphic VOA V
with central charge c = 24 and with trivial $V_1$.


Proving Conjecture 7.2.1 is one of the most important and difficult challenges in the
subject; the first small step towards this is [146]. If true, as is expected, it would tell
us $V^\natural$ is a fundamental exceptional structure, on par with the Leech lattice or the $E_8$
Lie algebra or indeed the Monster $\mathbb{M}$. We return to this conjecture in Section 7.3.4; the
analogue $A^{f\natural}$ for vertex operator superalgebras (holomorphic, c = 12 and $V_{1/2} = 0$) is
known and has automorphism group $Co_1$ [163]. Although the theta series $\Theta_L$ usually
doesn’t determine the lattice, $\Lambda$ is the unique lattice with theta series $\Theta_\Lambda$ (this follows
quickly from its above-mentioned uniqueness). It is thus tempting to also conjecture that
the Moonshine module is the unique VOA with graded dimension J (see Question 7.2.7).


7.2.2 The Monster Lie algebra $\mathfrak{m}$


It was discovered early on that every Hauptmodul is replicable, and moreover that any
replicable function is determined by its first few coefficients. An obvious approach to
Conjecture 7.1.1 then is to show that the McKay–Thompson series $T_g$ are also replicable.
To get the necessary identities satisfied by their q-expansions, Borcherds used the
denominator identity (Section 3.4.2) of a Lie algebra he associated with $V^\natural$.

We want to construct a Lie algebra $\mathfrak{m}$ from the Moonshine module $V^\natural = V^\natural_0 \oplus V^\natural_1 \oplus
\cdots$. Of course, the direct choice $V^\natural_1$ is 0-dimensional, so we must modify $V^\natural$ first. Recall
from Section 5.2.2 that a near-VOA V(L) is associated with any even indefinite lattice
L. Let $V_{1,1} := V(II_{1,1})$ be the near-VOA associated with the two-dimensional even self-dual
indefinite lattice $II_{1,1}$ defined in Section 1.2.1. We take both $V^\natural$ and $V_{1,1}$ to be real.
Define $\mathcal{V}$ to be the near-VOA $V^\natural \otimes V_{1,1}$. As we know, the Monster $\mathbb{M}$ acts on $V^\natural$; extend
this action to $\mathcal{V}$ by defining $\mathbb{M}$ to fix $V_{1,1}$. An invariant positive-definite bilinear form on
$V^\natural$ is constructed in [201]; extend it to $\mathcal{V}$ in the obvious way. Then the resulting form
$(\star|\star)$ is $\mathbb{M}$-invariant.

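The Monster Lie algebra $\mathfrak{m}$ is the quotient of $P\mathcal{V}_1$ by the radical of the form $(\star|\star)$
on $\mathcal{V}$, where the spaces $P\mathcal{V}_n$ are defined in (5.2.3). The radical contains $P\mathcal{V}_0$, so $\mathfrak{m}$ has
a natural (real) Lie algebra structure (see Question 7.2.4). From the $V_{1,1}$ part of $\mathcal{V}$ we
get the involution $\omega$ and $\mathbb{Z}^2$-grading (see e.g. section 6.2 of [323] for details). Then by
Theorem 3.3.6, a certain central extension of $\mathfrak{m}$ is some universal Borcherds–Kac–Moody
algebra $\mathfrak{g}(A)$; see [72], [323] for details. More precisely, its Cartan
matrix

$$A = \begin{pmatrix}
2 & 0 \cdots 0 & -1 \cdots -1 & \cdots \\
0 & -2 \cdots -2 & -3 \cdots -3 & \cdots \\
\vdots & \vdots & \vdots & \\
0 & -2 \cdots -2 & -3 \cdots -3 & \cdots \\
-1 & -3 \cdots -3 & -4 \cdots -4 & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix} \tag{7.2.1}$$

consists for each $i, j \in \{-1, 1, 2, 3, \ldots\}$ of a block in the (i, j) spot of size $a_i \times a_j$ and
with entries $-(i + j)$, where $a_i$ are the coefficients $J(\tau) = \sum_i a_i q^i$.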


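Theorem 7.2.2 The Monster Lie algebra is $\mathfrak{m} = \mathfrak{g}(A)/\mathfrak{c}$, where A is in (7.2.1) and $\mathfrak{c}$ is the
(infinite-dimensional) centre of $\mathfrak{g}(A)$. $\mathfrak{m}$ has Cartan subalgebra $\mathbb{R} \otimes_{\mathbb{Z}} II_{1,1} \cong \mathbb{R} \oplus \mathbb{R} =:
\mathfrak{m}_{(0,0)}$ and simple roots $\alpha_{i,k}$ for each $i \in \{-1, 1, 2, 3, \ldots\}$ and $1 \le k \le a_i$. Only $\alpha_{-1,1}$
is real. The root-space decomposition of $\mathfrak{m}$ is $\mathfrak{m} = \oplus_{i,j \in \mathbb{Z}}\, \mathfrak{m}_{(i,j)}$. The Monster $\mathbb{M}$ acts
on $\mathfrak{m}$ as Lie algebra automorphisms. Each root space $\mathfrak{m}_{(i,j)}$ (for $(i, j) \ne (0, 0)$) is an
$\mathbb{M}$-module isomorphic to the homogeneous space $(V^\natural)_{ij+1}$, while the Cartan subalgebra
$\mathfrak{m}_{(0,0)} \cong \rho_0 \oplus \rho_0$ as an $\mathbb{M}$-module. The denominator identity of $\mathfrak{m}$ is given in (3.4.7a).
Finally, $\mathfrak{m}$ has a vector-space decomposition $\mathfrak{u}^+ \oplus \mathfrak{gl}_2 \oplus \mathfrak{u}^-$ into a sum of Lie subalgebras,
where $\mathfrak{u}^\pm$ are free Lie algebras with countably many generators.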


The proof is given explicitly in section 6.2 of [323], and involves the No-Ghost Theorem
(see the appendix of [323]), a result first proved in string theory and special to VOAs with
central charge c = 24. In particular, the No-Ghost Theorem establishes the $\mathbb{M}$-module
isomorphisms in Theorem 7.2.2. $\mathfrak{m}$ has only one positive real root, so its Weyl group is
order 2 and sends (i, j) to (j, i); it is responsible for the difference on the right side of
(3.4.7a) (the j-function is the correction due to imaginary simple roots). The positive
roots are $(-1, 1)$ and the $a_{ij}$ of type (i, j), and this gives the product on the left.

Similarly, the fake Monster Lie algebra is associated in the same way with the near-VOA
$V(\Lambda) \otimes V_{1,1}$. Though it is certainly an interesting example of a Borcherds–Kac–Moody
algebra, it plays no role in the theory. Its name arose because it was initially
suspected of playing a role in the Moonshine proof, but like $V(\Lambda)$ it doesn’t carry a natural
action of $\mathbb{M}$ and so was discarded.

This construction of $\mathfrak{m}$ from $V^\natural$ may seem indirect. An alternate approach uses
Moonshine cohomology [386], a functor assigning to certain c = 2 near-VOAs a Lie algebra
carrying an action of $\mathbb{M}$. To $V_{1,1}$ this functor assigns $\mathfrak{m}$. This functor was anticipated in
[72] and [73] and was inspired by BRST (‘Becchi–Rouet–Stora–Tyutin’) cohomology
in string theory, or the semi-infinite cohomology of Lie theory. In particular, the standard
method for obtaining the space of physical states in a string theory involves tensoring the
original space $\mathcal{H}$ (a CFT with c = 26) with a space $\mathcal{H}_{ghosts}$ of ghosts (with c = −26); on
$\mathcal{H} \otimes \mathcal{H}_{ghosts}$ is an operator Q obeying $Q^2 = 0$, and the space $\mathcal{H}_{phys}$ of physical states is
the cohomology $H^* = \ker Q / \mathrm{im}\, Q$. In particular, $\mathfrak{m}$ is the space $H^1$ for $\mathcal{H} = V^\natural \otimes V_{1,1}$.
The Baby Monster Lie algebra [72], which plays the same role for B as $\mathfrak{m}$ plays for $\mathbb{M}$,
can be obtained in a similar way [290].

Because of a cohomological interpretation of denominator identities valid for any
Borcherds–Kac–Moody algebra, (3.4.7a) can be ‘twisted’ by any $g \in \mathbb{M}$. This is how
Borcherds derived (7.1.8b). These formulae are equivalent to the replication formulae
(7.1.8a) conjectured in Section 7.1.4. However, these identities are obtained by more
elementary means, requiring less of the theory of Borcherds–Kac–Moody algebras, in
[324], [331], permitting a simplification of Borcherds’ proof at this stage. In particular,
in [324] the replication formulae (7.1.8a) appear quite naturally because $\mathfrak{u}^\pm$ are free Lie
algebras.


7.2.3 The algebraic meaning of genus 0 


Now, it turns out that if we verify for each conjugacy class $K_g$ of $\mathbb{M}$ that the first,
second, third, fourth and sixth coefficients of the McKay–Thompson series $T_g$ and the
corresponding Hauptmodul $J_{\Gamma_g}$ agree, then $T_g = J_{\Gamma_g}$. That is precisely what Borcherds
then did: he compared finitely many coefficients, and as they all equal what they should,
this concluded the proof of Monstrous Moonshine!

However, this case-by-case verification occurred at the critical point where the
McKay–Thompson series were being compared directly to the Hauptmoduls, and so
provides little insight into why the $T_g$ are genus 0. Recall that the main purpose for the
proof of Conjecture 7.1.1 was not to establish its logical validity, since the numerical evidence
was already quite strong. Rather, the proof is supposed to help us understand how the
Monster could be related to Hauptmoduls. This case-by-case verification became known
as the conceptual gap. The basic problem is that $V^\natural$, $\mathfrak{m}$ and (7.1.8b) are algebraic, and the
genus-0 property is topological. Fortunately, a more conceptual explanation of the equality
$T_g = J_{\Gamma_g}$ (a conversion of the Hauptmodul property into an algebraic statement)
has been found [122], replacing Borcherds’ coefficient check with a general theorem.

Let p be prime. Exactly as in the argument of (7.1.6a), we find that the quantity

    J^k(p\tau) + J^k(\tau/p) + J^k((\tau+1)/p) + \cdots + J^k((\tau+p-1)/p)        (7.2.2a)

is a degree-pk polynomial in J(τ). This uses the Hauptmodul property of J. Thus there
is a polynomial F_p(X, Y), of degree p + 1 in both X and Y, defined by

    F_p(X, J(\tau)) = (X - J(p\tau)) \prod_{i=0}^{p-1} (X - J((\tau+i)/p)).        (7.2.2b)

Indeed, the coefficients of F_p(X, J(τ)) are symmetric polynomials in the roots J(pτ),
J((τ+i)/p), and so can be expressed polynomially using (7.2.2a). For example,

    F_2(X, Y) = (X^2 - Y)(Y^2 - X) - 393768 (X^2 + Y^2) - 42987520 XY
                - 40491318744 (X + Y) + 121136760788544.
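The p = 2 example can be checked directly on q-expansions. The following is a minimal sketch, not taken from the text: it assumes the standard formula j = E4³/Δ with J = j − 744, and all function names are illustrative. It substitutes X = J(2τ), Y = J(τ) into F_2, multiplies through by q^6 to clear the poles, and solves for the constant term at order q^6 instead of hard-coding it; the value that comes out is 121136760788544, and all other computed coefficients then vanish.

```python
def series_mul(a, b, n):
    """Product of two power series (coefficient lists), truncated to n terms."""
    out = [0] * n
    for i, ai in enumerate(a[:n]):
        if ai:
            for j, bj in enumerate(b[:n - i]):
                out[i + j] += ai * bj
    return out


def series_inv(a, n):
    """Inverse of a power series with a[0] == 1, truncated to n terms."""
    inv = [1] + [0] * (n - 1)
    for k in range(1, n):
        inv[k] = -sum(a[i] * inv[k - i] for i in range(1, k + 1))
    return inv


def q_times_J(n):
    """Coefficients of q*J(tau) = 1 + 0*q + 196884*q^2 + ..., up to q^(n-1)."""
    sigma3 = lambda m: sum(d ** 3 for d in range(1, m + 1) if m % d == 0)
    e4 = [1] + [240 * sigma3(m) for m in range(1, n)]
    delta_over_q = [1] + [0] * (n - 1)           # prod_{m>=1} (1 - q^m)^24
    for m in range(1, n):
        for _ in range(24):
            for k in range(n - 1, m - 1, -1):    # multiply in place by (1 - q^m)
                delta_over_q[k] -= delta_over_q[k - m]
    e4_cubed = series_mul(series_mul(e4, e4, n), e4, n)
    qj = series_mul(e4_cubed, series_inv(delta_over_q, n), n)
    qj[1] -= 744                                 # J = j - 744
    return qj


def shift(a, s, n):
    """Multiply a power series by q^s, truncated to n terms."""
    return ([0] * s + a)[:n]


def check_F2(n=12):
    """Return (constant term forced at q^6, coefficients of q^6 * F_2(J(2 tau), J(tau)))."""
    A = q_times_J(n)                             # q   * J(tau)
    B = [0] * n                                  # q^2 * J(2 tau)
    for k, c in enumerate(A):
        if 2 * k < n:
            B[2 * k] = c
    A2, B2, AB = series_mul(A, A, n), series_mul(B, B, n), series_mul(A, B, n)
    A3, B3 = series_mul(A2, A, n), series_mul(B2, B, n)
    # (X^2 - Y)(Y^2 - X) = X^2 Y^2 - X^3 - Y^3 + XY, each term scaled by q^6
    pieces = [
        (1, series_mul(A2, B2, n)), (-1, B3),
        (-1, shift(A3, 3, n)), (1, shift(AB, 3, n)),
        (-393768, shift(B2, 2, n)), (-393768, shift(A2, 4, n)),
        (-42987520, shift(AB, 3, n)),
        (-40491318744, shift(B, 4, n)), (-40491318744, shift(A, 5, n)),
    ]
    total = [sum(c * s[k] for c, s in pieces) for k in range(n)]
    constant = -total[6]                         # the constant term of F_2 enters at q^6
    total[6] += constant
    return constant, total
```

Calling `check_F2(12)` tests the identity through order q^11 only; it is a consistency check on the coefficients, not a proof. The check needs n of at least 7.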


Definition 7.2.3 Consider a formal series f(τ) = q^{-1} + Σ_{n=1}^∞ b_n q^n (‘formal’ means
we don’t worry about whether it converges). An order-n modular equation for f is a
monic polynomial F_n(x, y) in two variables, of degree ψ(n) := n ∏_{primes p|n} (1 + 1/p),
such that

    F_n( f((a\tau+b)/d), f(\tau) ) = 0

for all integers a, b, d ≥ 0 such that ad = n, gcd(a, b, d) = 1 and 0 ≤ b < d.


This definition looks a little obscure, but it is natural. The degree ψ(n) is precisely the
number of those triples (a, b, d). These triples come from the coset expansion

    \Gamma_0(K) \begin{pmatrix} 1 & 0 \\ 0 & n \end{pmatrix} \Gamma_0(K)
        = \bigcup_{a,b,d} \Gamma_0(K) \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}

for any K obeying n ≡ 1 (mod K). Modular equations necessarily obey F_n(x, y) =
±F_n(y, x).
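The count of triples can be confirmed by brute force. This is a small sketch with illustrative function names; it takes b in the range 0 ≤ b < d and compares the number of admissible triples (a, b, d) against ψ(n) = n ∏_{p|n} (1 + 1/p).

```python
from math import gcd


def psi(n):
    """psi(n) = n * prod over primes p dividing n of (1 + 1/p), in exact integers."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result = result // p * (p + 1)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                        # one prime factor larger than sqrt(n) remains
        result = result // m * (m + 1)
    return result


def triples(n):
    """All (a, b, d) with ad = n, 0 <= b < d and gcd(a, b, d) = 1."""
    return [(n // d, b, d)
            for d in range(1, n + 1) if n % d == 0
            for b in range(d)
            if gcd(gcd(n // d, b), d) == 1]
```

For a prime p this gives the p + 1 triples (p, 0, 1) and (1, b, p) for 0 ≤ b < p, matching the p + 1 roots J(pτ), J((τ+b)/p) appearing in (7.2.2b).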

Thus J(τ) obeys a modular equation for all n. Note that this property depends crucially
on it being a Hauptmodul. Conversely, does the existence of modular equations imply
the Hauptmodul property? Unfortunately not: the exponential function f(τ) = q^{-1} also
obeys one for every n. For example, for p prime, take F_p(x, y) = (x^p − y)(x − y^p) (see
also Question 7.2.5).

Beautiful and unexpected is that the only functions f(τ) = q^{-1} + b_1 q + ··· to obey
modular equations for all n are J(τ) and the ‘modular fictions’ q^{-1} and q^{-1} ± q (which
are essentially exp, cos and sin) [360]. More generally, we have the following:


Theorem 7.2.4 [122] Let f(τ) be a formal series q^{-1} + Σ_{n=1}^∞ b_n q^n, b_i ∈ C. Suppose
f satisfies a modular equation of order n for all n ≡ 1 (mod N). Then:
(a) f converges to a holomorphic function on H.
(b) If the symmetry group Γ(f) := {α ∈ SL2(R) | f(α.τ) = f(τ)} consists only of the
translations ±\begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}, then f(τ) = q^{-1} + ξq for some
coefficient ξ ∈ C; if the coefficient ξ is an algebraic number, then ξ = 0 or ξ^{gcd(24,N)} = 1.
(c) If the symmetry group Γ(f) does not only contain translations, then Γ(f) is genus 0
and f is a Hauptmodul for Γ(f). Moreover, Γ(f) contains some subgroup Γ0(K),
for K|N^∞.


Conversely, if f is a Hauptmodul for some subgroup Γ of SL2(R) containing Γ0(K), and
all coefficients b_i lie in the cyclotomic field Q[ξ_K], then f obeys a modular equation for
every n ≡ 1 (mod K). For the other n coprime to K, there is also a modular equation
involving twisting by the Galois group, as in (2.3.14). See [122] for details. The condition
K|N^∞ means all primes dividing K also divide N.

The denominator identity argument tells us each T_g obeys a modular equation for
each n ≡ 1 modulo the order N = o(g) of g, so Theorem 7.2.4 concludes the proof of
Monstrous Moonshine, and replaces Borcherds’ coefficient check.

The proof of Theorem 7.2.4 is difficult. First, it is established that f is holomorphic
on H. This implies that whenever f(τ_1) = f(τ_2), there is a diffeomorphism α defined
locally about τ_1, such that α(τ_1) = τ_2 and f(α(τ)) = f(τ). The hard part of the proof is
to show α extends to all of H. Once that is done, we know α is a Möbius transformation,
and the rest of the argument is reasonably straightforward.

In [120] it is shown that if f obeys a modular equation for any n, all of whose prime
divisors are congruent to 1 (mod N), then either f = q^{-1} + ξq for some ξ, or f is the
Hauptmodul for a group containing some Γ(N′). However, computer calculations by
[102] indicate that the hypothesis of these theorems can be considerably weakened:


Conjecture 7.2.5 [102], [120] Let f(τ) = q^{-1} + Σ_{n=1}^∞ b_n q^n be a formal series and
p, p′ any two distinct primes. If f satisfies modular equations for both p and p′,
then f converges in H to a holomorphic function, and either f(τ) = q^{-1} + ξq for
ξ^{gcd(p−1,p′−1)+1} = ξ, or f is the Hauptmodul for a genus-0 group containing Γ(N) for
N coprime to pp′.


This conjecture is completely out of reach at present. 

Finding modular equations was a passion of Ramanujan, who filled his notebooks
with them. See [82] for an application of Ramanujan’s modular equations (namely, for
the function p(τ) = (d/dτ) log η(τ)) to computing the first billion or so digits of π.

In Section 1.7.2 we show that although radicals can be used to solve (i.e. find closed
expressions for the roots of) arbitrary polynomials of degree 4 or less, they are inadequate
to solve all polynomials of degree 5 or higher. However, much as the relation cos(3θ) =
4cos³(θ) − 3cos(θ) yields the solution to cubics, a modular equation relating τ and 5τ
for √(θ_3/η) can be used to solve quintic polynomials (see e.g. chapter 7 of [464]).

Many of the applications of the j-function have to do with its modular equations. For
instance, recall from Theorem 1.7.1 that each abelian extension of Q lies inside some
cyclotomic field Q[ξ_n], in other words is generated by the values of the exponential
function exp[2πiα] when α is rational. Likewise, the abelian extensions of the imaginary
quadratic fields Q[√−d] are generated by the values of J(τ) for special τ. See [117] for
a review of this part of what is called class field theory. Modular equations are used to
establish properties of those special values of J(τ) (see Question 7.2.2).

Generalising a little a definition of McKay (recall Conjecture 7.1.5), we get: 


Definition 7.2.6 By a modular fiction we mean any function of the form f(τ) =
q^{-1} + ξq, where either ξ = 0 or ξ^{24} = 1.


The point is that these behave like the modular functions T_g. More precisely [122],
these are precisely the non-Hauptmoduls with cyclotomic integer coefficients, which
obey (Galois-twisted) modular equations for each n (see [122] for more details). Perhaps,
exceptional though they are, they shouldn’t be ignored. This suggests the following:


Problem What is the VOA-related question, for which ‘24’ is the answer? 


More precisely, out of which VOA-like structure can we obtain the modular fictions, in
a way analogous to how the T_g are obtained from V♮? That structure would complete
Moonshine for the modular fictions. Incidentally, it is manifest in the proof of
Theorem 7.2.4 that this 24 arises there through the usual exponent-2 property of Section 2.5.1.


7.2.4 Braided #7: speculations on a second proof 


Monstrous Moonshine began with the challenge to understand how the Monster (the
right side of (7.1.1)) could be related conceptually to modular functions (the left side of
(7.1.1)). We have seen that VOAs constitute a bridge between the two sides: the Monster
is the symmetry of a VOA V♮ whose graded dimension is the J-function.

That argument is still the only proof we have of Monstrous Moonshine. But does that 
put our finger on the essence of the mystery? The indirect argument sketched in the 
previous three subsections leaves the special role of the Monster unclear. As we’ll see 
shortly, it also ignores what CFT has tried to teach us regarding modularity. It should 
also be remarked that a VOA is quite a complicated beast — do we really need all of its 
rich structure, if all we care about is Moonshine? Is there a simpler explanation that, 
by requiring less machinery, is both more general and more conceptual and that more 
directly connects M to a Hauptmodul property? 

For these reasons, we should look for a second proof of Monstrous Moonshine. But
what would it look like? To get a hint, let’s recall the CFT explanation of modularity.

Two essentially equivalent formulations of quantum field theory are:


(i) The Hamiltonian formulation (canonical quantisation), which presents us with a 
state space V , carrying a representation of the symmetry algebra of the theory, and 
includes among other things a Hamiltonian (energy operator) H . 

(ii) The Feynman formulation, which interprets the amplitudes using path integrals. 


In RCFT, the Hamiltonian formulation describes concretely the space V, graded by H, on
which we take the trace tr_V q^H, and hence gives us the coefficients of our q-expansions.
The Feynman path formalism, on the other hand, interprets these graded traces as
functions over moduli spaces, and hence makes their modularity manifest. According to
RCFT, the modularity in Moonshine is the conjunction of these two formulations.

On the Hamiltonian side of CFT, the space V is a module for the chiral algebra (VOA)
𝒱. As such, it is a module of the Virasoro algebra (3.1.5) (giving us the Hamiltonian
H = L_0), as well as possibly other algebras (e.g. Kac–Moody) and groups (e.g. M). In our
hypothetical second proof, we would like to avoid the full VOA structure, but probably
the presence of Vir is fundamental if we want to give meaning to the coefficients in
the q-expansion, that is the grading of the modules. Thanks to the theory of VOAs, we
understand fairly well the Virasoro side. The remainder of this subsection will be devoted
to the more mysterious question: what is the key ingredient of the Feynman side?

In any treatment of RCFT (e.g. [436], [207], [131], [530], [32]), we read that
𝒱-characters (5.3.13) are ‘1-point functions on the torus’. By this is meant that they are
chiral blocks in 𝔅^{(1,1)}_𝒱 for the torus with one marked point, with that point labelled
with the ‘vacuum’ module 𝒱 itself (see e.g. Sections 4.3.3 and 5.3.4 for the physical
description). Verlinde’s formula tells us that space has dimension equal to the number
of irreducible modules M of the chiral algebra 𝒱, and indeed the characters χ_M form a
natural basis for it. As explained in Section 2.1.4, its (enhanced) mapping class group
Γ̂_{1,1} is the braid group B3. Thus B3 will act on the characters of the RCFT. From this,
using (1.1.10a), we obtain the action of the modular group SL2(Z).

To see this B3 action explicitly, we have to undo a simplification we performed in
Definition 5.3.6. The 1-point functions χ_M are actually functions of the triple (τ, v, z),
where τ lies in the Teichmüller space H of the torus with 1 puncture, v ∈ 𝒱 is the
insertion state and z ∈ C is a local coordinate at the puncture. Explicitly, as explained in
Section 5.3.4, for v ∈ 𝒱_{[k]} we get


    \chi_M(\tau, v, z) = \mathrm{tr}_M\, Y(v, e^{2\pi i z})\, q^{L_0 - c/24}
                       = e^{-2\pi i k z}\, \mathrm{tr}_M\, o(v)\, q^{L_0 - c/24},        (7.2.3a)


using the notation of Section 5.3.3 (compare with (5.3.13)). The group Γ̂_{1,1} is (like any
mapping class group) generated by the Dehn twists, and as mentioned we obtain

    \hat\Gamma_{1,1} = \langle \sigma_1, \sigma_2 \mid \sigma_1\sigma_2\sigma_1 = \sigma_2\sigma_1\sigma_2 \rangle \cong B_3,        (7.2.3b)

where σ_i are the Dehn twists of Figure 2.8. The action of σ_i on the characters is then

    \sigma_1.\chi_M(\tau, v, z) = e^{-2\pi i k/12}\, \chi_M(\tau + 1, v, z),        (7.2.4a)
    \sigma_2.\chi_M(\tau, v, z) = e^{-2\pi i k/12}\, \chi_M\left( \frac{\tau}{1-\tau}, \frac{v}{(1-\tau)^k}, z \right),        (7.2.4b)

so in particular we get

    (\sigma_1\sigma_2\sigma_1)^2.\chi_M(\tau, v, z) = e^{-2\pi i k/2}\, \chi_M(\tau, (-1)^k v, z),        (7.2.4c)
    (\sigma_1\sigma_2\sigma_1)^4.\chi_M(\tau, v, z) = e^{-2\pi i k}\, \chi_M(\tau, v, z).        (7.2.4d)


The combination (σ1σ2σ1)^4, which is trivial in the (unenhanced) mapping class group
Γ_{1,1} = SL2(Z), here equals the Dehn twist about the puncture, which for the logarithmic
parameter z of course sends z to z + 1. The actions of σ_i on τ and v are determined from
the homomorphism B3 → SL2(Z) given by (1.1.11b) at w = −1. This should be very
reminiscent of Section 2.4.3.
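The collapse from B3 to SL2(Z) can be seen on 2×2 matrices. The sketch below assumes one standard choice of images for the two braid generators, namely the translation τ ↦ τ + 1 and the map τ ↦ τ/(1 − τ); these matrices are an assumption and may differ from the book's (1.1.10a) and (1.1.11b) by a choice of convention. It checks the braid relation and that the fourth power of σ1σ2σ1 maps to the identity.

```python
def mat_mul(x, y):
    """Product of two 2x2 integer matrices given as nested tuples."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def mat_pow(x, k):
    """k-th power (k >= 0) of a 2x2 integer matrix."""
    out = ((1, 0), (0, 1))
    for _ in range(k):
        out = mat_mul(out, x)
    return out


S1 = ((1, 1), (0, 1))     # assumed image of sigma_1: tau -> tau + 1
S2 = ((1, 0), (-1, 1))    # assumed image of sigma_2: tau -> tau / (1 - tau)

BRAID_LEFT = mat_mul(mat_mul(S1, S2), S1)     # image of sigma_1 sigma_2 sigma_1
BRAID_RIGHT = mat_mul(mat_mul(S2, S1), S2)    # image of sigma_2 sigma_1 sigma_2
```

Both products equal the matrix with rows (0, 1) and (−1, 0); its square is −I and its fourth power is I. So the central element (σ1σ2σ1)^4 of B3 is invisible in SL2(Z), and (σ1σ2σ1)² is invisible in PSL2(Z).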

Of course here the state v comes from the vacuum sector 𝒱 so the conformal weight
k is an integer. We are in the situation of Section 2.4.3, where our B3 action collapses
to one of PSL2(Z), since the centre (7.2.4c) acts trivially. This is why the z-dependence
of χ_M could be safely ignored in Definition 5.3.6. As before, the more interesting case
is when the weight k of the modular form is not integral. Here, that will happen when
we insert states from other 𝒱-modules, that is when we consider chiral blocks from the
other 𝔅^{(1,1)}_M. In CFT these are equally fundamental. In this case, v ∈ M will have rational
conformal weight k ∈ h_M + N, and here the Dehn twist about the puncture will typically
not act trivially. As happened with the Dedekind eta function in (2.4.14), we will then
see nontrivial B3 actions² (involving e.g. the S^{(a)} of Figure 6.4).

It should be clear that in RCFT, modularity is a topological effect. Zhu’s Theorem 5.3.8
generalises the appearance of SL2(Z) in RCFT to any RVOA, but as we recall from
Section 5.3.5, the proof follows closely the intuition of RCFT: modularity in VOAs
arises through that SL2(Z) action on the space of chiral blocks, which is inherited from
the topological Γ̂_{1,1}-action mentioned above, once we drop (as Zhu did) the dependence
on z.

A toy model of this idea is provided by the proof in Section 2.4.2 of the modularity
of θ_3: we can interpret this action of SL2(Z) as an action of B3. Note that this action of
SL2(Z) on the Heisenberg group H is really the action of B3 on the group R² given in
(2.4.15b); it factors through to SL2(Z) because R² is abelian.

The relation of the Hamiltonian (Vir) side to that of Feynman (B3) is that the Virasoro
algebra acts naturally on the enhanced moduli space M̂_{1,1} (see Section 3.1.2), whose
mapping class group is B3. This Vir-action leads to the KZ equations, which are partial
differential equations obeyed by the chiral blocks in 𝔅^{(1,1)}, that is by the VOA characters.
The monodromy group of those equations is Γ̂_{1,1} ≅ B3, and thus B3 acts on 𝔅^{(1,1)}.

Of course the reason Borcherds chose a different route in [72] is that we need more than
merely modularity: we need the genus-0 property. But as we will see in Section 7.3.3,
Norton has proposed a possible relationship between the Monster and the genus-0
property, and his method also involves the B3 action given in (2.4.15b). Finally, we argue in
Section 6.3.3 that the g-action associated with B3 underlies the Galois action in RCFT.
In all of these examples, the modular group arose from an underlying appearance of the
braid group B3. Is this the same B3? We suggest that this braid group action (together
with a compatible Virasoro action) somehow underlies Moonshine, and pursuing this
thought would lead to a second, more conceptual proof of Monstrous Moonshine.


2 The thought that, for example, topological field theory really sees B3 and not SL2(Z) is also made in [404].


Question 7.2.1. Verify that any replicable function is uniquely determined by finitely 
many coefficients. 


Question 7.2.2. (a) Verify that J(τ) obeys a modular equation for every n = 2, 3, 4, ...
(b) Suppose τ_0 = r + i√s for rational r, s, where s > 0. Use part (a) to prove that J(τ_0)
is an algebraic number.


Question 7.2.3. Verify that any replicable function obeys a modular equation. 
Question 7.2.4. Prove that PV, /rad(«|x) is a Lie subalgebra of PV, /PVp. 


Question 7.2.5. For each n, find the modular equations obeyed by the modular fictions
(a) f(τ) = q^{-1}; (b) f(τ) = q^{-1} + q; (c) f(τ) = q^{-1} − q.


Question 7.2.6. Arguably, what makes two-dimensional quantum field theory so unique 
is the possibility of braid statistics. Could those braid groups directly be responsible for 
the B3 action of Section 7.2.4? 


Question 7.2.7. Call any VOA 𝒱 obeying the hypotheses of Corollary 6.2.5, ‘nice’. Prove
that a nice 𝒱 is holomorphic iff its graded dimension χ_𝒱(τ) is invariant under τ ↦ −1/τ.
Use this to show, for the class of nice VOAs, that Conjecture 7.2.1 is true iff V♮ is the
unique nice VOA with graded dimension J(τ).


7.3 More Monstrous Moonshine 


We give in this section a quick sketch of further developments and conjectures. As we 
know, Moonshine is an area where it is much easier to conjecture than to prove. 


7.3.1 Mini-Moonshine 


It is natural to ask about Moonshine for other groups. Of course any subgroup of M 
automatically inherits Moonshine by restriction, but this isn’t at all interesting. A very 
accessible sporadic is M24 — see, for example, chapters 10 and 11 of [113]. Most con- 
structions of the Leech lattice start with M24, and most constructions of the Monster 
involve the Leech lattice. Thus we are led to the following natural hierarchy of (most) 
sporadics: 


• M24 (from which we can get M11, M12, M22, M23); which leads to
• Co0 = 2.Co1 (from which we get HJ, HS, McL, Suz, Co3, Co2); which leads to
• M (from which we get He, Fi22, Fi23, Fi′24, HN, Th, B).


It can thus be argued that we could approach problems in Monstrous Moonshine by first
addressing in order M24 and Co1, which should be much simpler. Indeed, Moonshine
for M24 has been completely established in [153].

Largely by trial and error, Queen [466] established Moonshine for the following groups
(all essentially centralisers of elements of M): Co0, Th, 3.2.Suz, 2.HJ, HN, 2.A7, He,
M12. In particular, to each element g of these groups, there corresponds a series Q_g(τ) =
q^{-1} + Σ_{n=0}^∞ a_n(g) q^n, which is a Hauptmodul for some modular group of Moonshine-
type, and where each g ↦ a_n(g) is a virtual character. For Co0, 3.2.Suz, 2.HJ and 2.A7,
it is only a virtual character. Other differences with Monstrous Moonshine are that there
can be a preferred nonzero value for the constant term a_0, and that although Γ0(N) will
be a subgroup of the fixing group, it won’t necessarily be normal.

For example, Queen’s series Q_e for Co0 is the Hauptmodul (2.2.17a) for the genus-0
group Γ0(2). Checking the tables in [109], we see that 276, 299, 1771, 2024 and 8855 are
dimensions of irreducible modules of the Conway group Co1 (hence its Z2-extension
Co0), and 24 is the dimension of the Co0 representation associated with the Leech
lattice (it’s only a projective representation of Co1). We find 11 202 = 8855 + 1771 +
299 + 276 + 1, and the ambiguity 2048 = 1771 + 276 + 1 = 2024 + 24 is resolved in
favour of the latter by considering other character values and comparing to the list of
Hauptmoduls. That a virtual character is needed for Co0 is clear from the minus signs in
(2.2.17a). This Hauptmodul is better known as the McKay–Thompson series T_{2B} (and
the centraliser of 2B involves Co0, which isn’t a coincidence), but about half of Queen’s
Hauptmoduls Q_g for Co0 do not arise as T_g for M. Nevertheless, in the next subsection
we see how to interpret them through the Moonshine for M.

The Hauptmodul for Γ0(2)+ looks like

    q^{-1} + 4372 q + 96256 q^2 + 1240002 q^3 + \cdots        (7.3.1a)

and we find the relations

    4372 = 4371 + 1,   96256 = 96255 + 1,   1240002 = 1139374 + 96255 + 4371 + 2 \cdot 1,        (7.3.1b)

where 1, 4371, 96 255 and 1 139 374 are all dimensions of irreducible representations of
the Baby Monster B. Thus we may expect Moonshine for B. This should actually fall
into Queen’s scheme because (7.3.1a) is the McKay–Thompson series associated with
class 2A of M, and the centraliser of an element in 2A is a double cover of B.
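Both level-2 Hauptmoduls and the decompositions quoted above can be reproduced from eta quotients. The sketch below assumes the standard expressions T_{2B} = (η(τ)/η(2τ))^24 + 24 and T_{2A} = T_{2B} + 4096 (η(2τ)/η(τ))^24, which are not stated in this section; function names are illustrative.

```python
def series_inv(a, n):
    """Inverse of a power series with a[0] == 1, truncated to n terms."""
    inv = [1] + [0] * (n - 1)
    for k in range(1, n):
        inv[k] = -sum(a[i] * inv[k - i] for i in range(1, k + 1))
    return inv


def product_one_plus_q(n, power):
    """prod_{m>=1} (1 + q^m)^power for an integer power >= 0, truncated to n terms."""
    s = [1] + [0] * (n - 1)
    for m in range(1, n):
        for _ in range(power):
            for k in range(n - 1, m - 1, -1):    # multiply in place by (1 + q^m)
                s[k] += s[k - m]
    return s


def hauptmoduls_level_2(n):
    """Return (q*T_2B, q*T_2A) as coefficient lists up to q^(n-1)."""
    P = product_one_plus_q(n, 24)        # q^(-1) * (eta(2 tau)/eta(tau))^24
    t2b = series_inv(P, n)               # q * (eta(tau)/eta(2 tau))^24
    t2b[1] += 24                         # cancel the constant term -24
    t2a = [t2b[k] + (4096 * P[k - 2] if k >= 2 else 0) for k in range(n)]
    return t2b, t2a


def decompositions_hold(t2b, t2a):
    """Check the dimension sums quoted for Co0 (via T_2B) and for B (via T_2A)."""
    conway = t2b[4] == 8855 + 1771 + 299 + 276 + 1 and -t2b[3] == 2024 + 24
    baby = (t2a[2] == 4371 + 1 and t2a[3] == 96255 + 1
            and t2a[4] == 1139374 + 96255 + 4371 + 2 * 1)
    return conway and baby
```

With `t2b, t2a = hauptmoduls_level_2(8)`, index k of each list is the coefficient of q^(k−1) in the corresponding Hauptmodul, so `t2a[2:5]` holds the three coefficients appearing in (7.3.1a). The code only checks that the sums match; it says nothing about which decomposition into irreducibles is the correct one.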

However, there can’t be a VOA V = ⊕_n V_n with graded dimension (7.3.1a) and
automorphism group B, because, for example, the B-module V_3 doesn’t contain V_2 as a submodule
(recall Question 5.2.1). Nevertheless, Höhn deepened the analogy between M and B by
constructing a vertex operator superalgebra VB♮ of central charge c = 23.5, called the
shorter Moonshine module, closely related to V♮ (see e.g. [289]). Like V♮ it is
holomorphic (i.e. it has only one irreducible module), with automorphism group Z2 × B and
graded dimension

    \chi_{VB^\natural}(\tau) = q^{-47/48} (1 + 4371 q^{3/2} + 96256 q^2 + 1143745 q^{5/2} + \cdots).        (7.3.2a)

Of course the strange −47/48 is −c/24; the half-integer powers of q come from the odd
(i.e. fermionic) part of VB♮. Just as M is the automorphism group of the Griess algebra
V♮_2, so is B the automorphism group of the algebra (VB♮)_2. Just as V♮ is associated
with the Leech lattice Λ, so is VB♮ associated with the shorter Leech lattice O23,
the unique 23-dimensional positive-definite self-dual lattice with no vectors of length-
squared 2 or 1 (see chapter 6 of [113]). The automorphism group of O23 is a central
extension of Co2 by Z2. The relation between (7.3.2a) and (7.3.1a) will be clearer in the
next subsection.

Similarly, Duncan [163] constructs a vertex operator superalgebra A^{f♮} with c = 12
and automorphism group Co1. Again it is holomorphic, and has graded superdimension

    \chi_{A^{f\natural}}(\tau) = q^{-1/2} (1 + 276 q - 2048 q^{3/2} + 11202 q^2 - 49152 q^{5/2} + \cdots),        (7.3.2b)

i.e. is given by (2.2.17a) with τ ↦ τ/2 and hence is fixed by a genus-0 subgroup of
SL2(R) (see Question 7.3.1). It is the unique ‘nice’ holomorphic vertex operator
superalgebra with c = 12 and no elements with conformal weight 1/2, in perfect analogy with
the conjectured uniqueness of V♮ (Conjecture 7.2.1). The algebra A^{f♮} plays the same
role for Co1 that V♮ plays for M. In particular, just as V♮ is obtained from a Z2-orbifold,
so is A^{f♮}, and this removes the constant term and enhances the symmetry. From this
construction of A^{f♮}, it is then straightforward (see Theorem 7.1 in [163]) to compute
explicit finite expressions for the Thompson twists of (7.3.2b) by g ∈ Co1, using Frame
shapes as described in [111]. In this way, a genus-0 Moonshine for Co1 is established
(as expected, the arguments are far simpler than that for M).

There has been no interesting Moonshine rumoured for the remaining six sporadics
(the pariahs J1, J3, Ru, O’N, Ly, J4). There is some sort of weaker Moonshine for any
group that is an automorphism group of a vertex operator algebra (so this means any
finite group [152]!). Many finite groups of Lie type should arise as automorphism groups
of VOAs associated with affine algebras except defined over finite fields. But apparently
the known finite group examples of genus-0 Moonshine are limited to those involved
with M.


7.3.2 Twisted #7: Maxi-Moonshine 


In an important announcement [450], on par with [111], Norton unified and generalised 
Queen’s work. Unfortunately he called it ‘Generalised Moonshine’, but we won’t (recall 
the diatribe in Section 3.3.1). 

About a third of the McKay–Thompson series T_g will have some negative coefficients.
In Section 7.3.5 we see that Borcherds interprets them as dimensions of superspaces
(which automatically come with signs). Norton proposed that, although T_g(−1/τ) will
not usually be another McKay–Thompson series, it will always have nonnegative integer
q-coefficients, and these can be interpreted as ordinary dimensions. In the process, he
extended the g ↦ T_g assignment to commuting pairs (g, h) ∈ M × M.


Conjecture 7.3.1 (Norton [450]) To each pair g, h ∈ M, gh = hg, we have a function
N_{(g,h)}(τ) such that

    N_{(g^a h^c,\, g^b h^d)}(\tau) = \alpha\, N_{(g,h)}\left( \frac{a\tau + b}{c\tau + d} \right),
        \forall \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(Z),        (7.3.3)

for some root of unity α (of order dividing 24, and depending on g, h, a, b, c, d). N_{(g,h)}(τ)
is either constant, or generates the modular functions for a genus-0 subgroup of SL2(R)
containing some Γ(M). Constants N_{(g,h)}(τ) arise when all elements of the form g^a h^b (for
gcd(a, b) = 1) are ‘non-Fricke’ (defined below). Each N_{(g,h)}(τ) has a q^{1/M}-expansion for
that M; the coefficients of this expansion are characters evaluated at h of some central
extension of the centraliser C_M(g). Simultaneous conjugation of g, h leaves the function
unchanged: N_{(aga^{-1}, aha^{-1})}(τ) = N_{(g,h)}(τ).


We call N_{(g,h)}(τ) the Norton series. An element g ∈ M is called Fricke if the group Γ_g
fixing T_g contains an element sending 0 to i∞. In terms of the notation of Conjecture 7.1.1,
g ∈ M is Fricke iff the invariance group Γ_g contains the Fricke involution τ ↦ −1/(Mτ).
The identity e is Fricke, as are 120 of the 171 Γ_g. For example, the classes pA, for p
prime, are Fricke, while the classes pB are not.

The McKay–Thompson series are recovered by the g = e specialisation: N_{(e,h)}(τ) =
T_h(τ). Unlike the McKay–Thompson series, the Norton series can have cyclotomic
integer coefficients, and the groups fixing them may not contain Γ0(M). If g is Fricke,
then clearly N_{(g,e)}(τ) = T_g(τ/M). The action (7.3.3) of SL2(Z) is related to its natural
action on the fundamental group Z² of the torus, as we saw in Section 6.3.1, as well as
a natural action of the braid group, as we’ll see next subsection.

For example, when ⟨g, h⟩ ≅ Z2 × Z2 and g, h, gh are all in class 2A, then

    N_{(e,g)}(\tau) = N_{(e,h)}(\tau) = T_g(\tau) = q^{-1} + 4372 q + 96256 q^2 + \cdots,        (7.3.4a)
    N_{(g,e)}(\tau) = N_{(h,e)}(\tau) = T_g(\tau/2) = q^{-1/2} + 4372 q^{1/2} + 96256 q + \cdots,        (7.3.4b)
    N_{(g,h)}(\tau) = \sqrt{J(\tau) - 984} = q^{-1/2} - 492 q^{1/2} - 22590 q^{3/2} + \cdots,        (7.3.4c)
    N_{(g,g)}(\tau) = N_{(h,h)}(\tau) = q^{-1/2} + 4372 q^{1/2} - 96256 q + \cdots.        (7.3.4d)

Hence N_{(g,e)}(τ + 1) = −N_{(g,g)}(τ), giving us an example of a nontrivial α in (7.3.3).
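Two of these expansions are easy to confirm numerically. The sketch below uses only coefficients quoted in this chapter: the first terms of J(τ) = q^{-1} + 196884q + 21493760q² + ··· and of the series (7.3.1a). The function names are illustrative. It computes the square root in (7.3.4c) term by term, and applies τ ↦ τ + 1 to T_g(τ/2), under which q^{m/2} picks up the factor (−1)^m.

```python
from fractions import Fraction


def sqrt_series(a, n):
    """First n coefficients of the square root of a power series with a[0] == 1."""
    s = [Fraction(1)] + [Fraction(0)] * (n - 1)
    for k in range(1, n):
        cross = sum(s[i] * s[k - i] for i in range(1, k))
        s[k] = (a[k] - cross) / 2
    return s


# q * (J(tau) - 984) = 1 - 984 q + 196884 q^2 + 21493760 q^3 + ...
J_MINUS_984 = [1, -984, 196884, 21493760]

# T_g(tau/2) for g in class 2A: keys are 2 * (exponent of q), values are coefficients
N_GE = {-1: 1, 1: 4372, 2: 96256, 3: 1240002}


def translate(series):
    """Apply tau -> tau + 1 to a series in q^(1/2): q^(m/2) is multiplied by (-1)^m."""
    return {m: (-c if m % 2 else c) for m, c in series.items()}
```

`sqrt_series(J_MINUS_984, 4)` gives 1, −492, −22590, −367400, which after multiplying by q^{-1/2} matches (7.3.4c), and `translate(N_GE)` is minus the series q^{-1/2} + 4372q^{1/2} − 96256q + 1240002q^{3/2} − ···, so the root of unity α in this instance is −1.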

The basic tool we have for approaching Moonshine conjectures is the theory of VOAs,
so we need to understand Norton’s suggestion from that point of view. This is done using
twisted modules (Section 5.3.6). For each g ∈ M, there is a unique g-twisted module of
V♮ [150]; call this twisted module V♮(g). This generalises the holomorphicity of V♮
mentioned in Section 7.2.1. Given any automorphism h ∈ Aut(V♮) commuting with g,
we can perform Thompson’s trick (5.3.23) and write

    q^{-1}\, \mathrm{tr}_{V^\natural(g)}\, h\, q^{L_0} =: Z(g, h; \tau).        (7.3.5)

Then Z(g, h) = N_{(g,h)}.

[150] proves that, whenever the subgroup ⟨g, h⟩ generated by g and h is cyclic, then
N_{(g,h)} will be a Hauptmodul satisfying (7.3.3). This will happen, for instance, whenever
the orders of g and h are coprime. [150] proves this by reducing it to Conjecture 7.1.1
(which is now a theorem). Extending [150] to all commuting pairs g, h is one of the
most pressing tasks in Moonshine.

Höhn [290] verified Conjecture 7.3.1 for g in class 2A and h ∈ C_M(g) ≅ 2.B. In
particular, those 247 functions N_{(g,h)}(2τ) are Hauptmoduls for genus-0 groups of Moonshine-
type (see Question 7.3.1). The proof mirrors that of [72] fairly closely. There is a simple
relation between the twisted module V♮(g) and the shorter Moonshine module VB♮,
and from this the 286 Thompson twists of (7.3.2a) can be obtained [290]. Verifying
Conjecture 7.3.1 for g in class 2B should likewise be possible.

More satisfying though would be a uniform proof of Conjecture 7.3.1, for example,
by considering the full orbifold V♮/M. It appears that the 3-cocycle α corresponding to
this orbifold (recall the cohomological twist of Section 5.3.6) will have to be nontrivial;
in fact, its order in H³(M, C^×) should be a multiple of 12 [408]. Suggestive is that
the permutation orbifold M®"/(g) gives a natural interpretation of the left-half of the 
definition (7.1.9) of a replicable function. 

The orbifold theory for M24 is established in [153] (the relevant series Z(g, h) had
already been constructed in [407]). Next up should be the orbifold theory for Conway’s
group Co1, but that seems out of reach right now, in spite of [163].

As has been alluded to elsewhere in this book, the subfactor approach complements 
that of VOAs. In particular, orbifolds seem more accessible for them [157], [332]. 


7.3.3 Why the Monster? 


That M is associated with modular functions can be explained mathematically by it
being the automorphism group of the vertex operator algebra V♮. But what is so special
about that group M that these modular functions T_g and N_{(g,h)} should be Hauptmoduls?
In fact, every group known to have rich genus-0 Moonshine properties is contained
in the Monster. To what extent can we derive M from Monstrous Moonshine? Our
understanding of this seemingly central role of M is still poor.

The most interesting approach to this important question is due to Norton, and was first
(cryptically) stated in [450]: the Monster is probably the largest (in a sense) group with
the 6-transposition property. A k-transposition group G is one generated by a conjugacy
class K of involutions, where the product gh of any two elements of K has order ≤ k.
For example, take K to be the transpositions in the symmetric group S_n, that is, K is the
set of all permutations (ij). Since π ∘ (ij) ∘ π^{-1} = (πi, πj), K is a conjugacy class
in S_n. An easy induction on n confirms that S_n is generated by K. Moreover, (ij)(kℓ)
has order 1, 2, 3, respectively iff the set {i, j} ∪ {k, ℓ} has cardinality 2, 4, 3. Thus S_n
is a 3-transposition group (this example is the source of the name ‘k-transposition’).
The Monster M is 6-transposition, for the choice of class K = 2A (see Section 7.3.6 for
more details). Transposition groups were used in the finite simple group classification
by Fischer to great effect. The simplest relation known to this author, of the number ‘6’
to genus 0, is given in Question 7.3.2.
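The symmetric-group example is small enough to verify by brute force. The following sketch, with illustrative function names, represents permutations of {0, ..., n−1} as tuples of images, collects the orders of all products of two transpositions, and confirms by closure that the transpositions generate a group of order n!.

```python
from itertools import combinations


def compose(p, q):
    """Composition p o q of permutations given as tuples of images."""
    return tuple(p[i] for i in q)


def order(p):
    """Order of a permutation."""
    ident = tuple(range(len(p)))
    k, cur = 1, p
    while cur != ident:
        cur = compose(cur, p)
        k += 1
    return k


def transpositions(n):
    """The conjugacy class K of all transpositions (i j) in S_n."""
    out = []
    for i, j in combinations(range(n), 2):
        t = list(range(n))
        t[i], t[j] = j, i
        out.append(tuple(t))
    return out


def product_orders(n):
    """Set of orders of products gh with g, h both transpositions in S_n."""
    K = transpositions(n)
    return {order(compose(g, h)) for g in K for h in K}


def generated_group_size(gens):
    """Size of the group generated by gens, found by closing under left multiplication."""
    ident = tuple(range(len(gens[0])))
    seen, frontier = {ident}, [ident]
    while frontier:
        nxt = []
        for x in frontier:
            for g in gens:
                y = compose(g, x)
                if y not in seen:
                    seen.add(y)
                    nxt.append(y)
        frontier = nxt
    return len(seen)
```

For n = 5 the set of product orders is {1, 2, 3} and the generated group has 120 elements. The same functions would in principle test the 6-transposition property for any group given by permutations, though the Monster itself is far out of reach of this kind of enumeration.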


The group Γ = PSL2(Z) is isomorphic to the free product Z3 ∗ Z2 generated by an
order 3 element u = \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix} and an order 2 element
v = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. A transitive
action of Γ on a finite set X with one distinguished point x_0 ∈ X is equivalent to specifying
a finite index subgroup Γ_0 of Γ. In particular, Γ_0 is the stabiliser {g ∈ Γ | g.x_0 = x_0} of x_0,
X can be identified with the cosets Γ_0\Γ and x_0 with the coset Γ_0. (If we avoid specifying
x_0, then Γ_0 will be identified only up to conjugation.) As an abstract group, Γ_0 will be
a free product of a certain number of Z2’s, Z3’s and Z’s (e.g. the free group F_n = Z ∗ Z ∗ ··· ∗ Z,
n times).

To such an action, we can associate a directed graph G: its vertices are labelled by
the set X, and we draw a solid edge directed from x to u.x, and a dotted undirected
edge between x and v.x. Choose any spanning tree T of G (i.e. a connected subgraph of
G containing all vertices of G and the minimum possible number (‖X‖ − 1) of edges).
Then the Reidemeister–Schreier method (see e.g. the appendix to [292] or section I.3 of
[103]) gives a presentation for Γ_0, with one generator for every edge in G not in T.

We are more interested though in a triangulation of the closed surface Γ₀\H, called
a (modular) quilt, which we can canonically associate with the action of Γ on X. The
definition, originally due to Norton and further developed by Parker, Conway and Hsu,
is somewhat involved and will be avoided here (but see especially chapter 3 of [292]).
It is so-named because there is a polygonal ‘patch’ covering every cusp of Γ₀\H, and
the closed surface is formed by sewing together the patches along their edges (‘seams’).
There are a total of 2n triangles and n seams in the triangulation, where n is the index
‖Γ₀\Γ‖ = ‖X‖. The boundary of each patch has an even number of edges, namely the
double of the corresponding cusp width. The formula (2.2.16) for the genus g of Γ₀\H
in terms of the index n and the numbers n_i of Γ₀-orbits of fixed points of order i, can be
interpreted in terms of the data of the quilt (see (6.2.3) of [292]), and we find in particular
that if every patch of the quilt has at most six sides, then the genus will be 0 or 1, and
genus 1 only exceptionally.
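
As a rough numerical companion, the genus can be computed directly from the permutation data of the action. The sketch below assumes that (2.2.16) is the standard genus formula g = 1 + n/12 − n₂/4 − n₃/3 − n_∞/2 for a finite-index subgroup of PSL₂(Z), with n₂ and n₃ the numbers of fixed points of v and u, and n_∞ the number of cycles of uv (the cusps, whose widths are the cycle lengths). The function names and the sample permutations are hypothetical.

```python
from fractions import Fraction


def cycle_count(perm):
    """Number of cycles (fixed points included) of a permutation of range(n)."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return count


def genus(u, v):
    """Genus from the assumed standard formula; u, v are lists as above."""
    n = len(u)
    n3 = sum(1 for x in range(n) if u[x] == x)  # fixed points of order 3
    n2 = sum(1 for x in range(n) if v[x] == x)  # fixed points of order 2
    uv = [u[v[x]] for x in range(n)]            # cycles of uv <-> cusps
    cusps = cycle_count(uv)
    g = 1 + Fraction(n, 12) - Fraction(n2, 4) - Fraction(n3, 3) - Fraction(cusps, 2)
    assert g.denominator == 1, "input is not a valid action of PSL2(Z)"
    return int(g)
```

Both the trivial action and the illustrative index-3 action `u = [1, 2, 0]`, `v = [1, 0, 2]` give genus 0.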

The quilt picture was specifically designed for one class of these Γ-actions (actually an
SL₂(Z)-action, but this doesn’t matter). Fix a finite group G (we’re most interested in the
choice G = M). Recall from (2.4.15) the right action of B₃ on triples (g₁, g₂, g₃) ∈ G³,
and the equivalent reduced action of B₃ on G². We will be interested in this action on
the subset of G² where all g_i ∈ G are involutions. The modular group SL₂(Z) is related
to B₃ by (1.1.10a). From this, we can get an action of SL₂(Z) in two ways: either (i)
by restricting to commuting pairs g, h; or (ii) by identifying each pair (g, h) with all
conjugates (aga⁻¹, aha⁻¹). Norton’s SL₂(Z) action (7.3.3) arises from the B₃ action of
(2.4.15b), when we combine both (i) and (ii).

The number of sides in each patch of the corresponding quilt is determined by the
orders of the g_i in these pairs. Taking G to be the Monster, and the involutions
g_i from class 2A, then each patch will have ≤ 6 sides, and the corresponding genus
will be 0 (usually) or 1 (exceptionally). In this way we can relate the Monster with a
genus-0 property. This approach to genus 0 faces the same challenge as any other: how
to incorporate the Atkin–Lehner involutions of Proposition 7.1.2(ii).

Based on the B₃ actions (2.4.15), Norton hopes for some analogue of Moonshine valid
for noncommuting pairs. Although the resulting series are always modular, they may
not be Hauptmoduls, their fixing group may not contain some Γ(N), and the coefficients
won’t always be cyclotomic integers. CFT considerations (‘higher-genus orbifolds’)
alluded to in Section 6.3.1 suggest that this might be more natural to do using, for example,
noncommuting quadruples (g₁, g₂, h₁, h₂) ∈ M⁴ obeying g₁h₁g₁⁻¹h₁⁻¹ = h₂g₂h₂⁻¹g₂⁻¹;
the role of SL₂(Z) is then played by higher-genus mapping class groups.

An important question is, how much does Monstrous Moonshine determine the Monster?
How much of M’s structure can be deduced from, for example, McKay’s E₈ Dynkin
diagram observation (Section 7.3.6), and/or the (complete) replicability of the T_g, and/or
Conjecture 7.3.1, and/or Modular Moonshine in Section 7.3.5 below? A small start
towards this is taken in [452], where some control on the subgroups of M isomorphic to
Z_p × Z_p (p prime) is obtained, using only the properties of the N_(g,h). See also chapter 8
of [292].


7.3.4 Genus 0 revisited 


Tuite [532] suggests a very intriguing reformulation of the genus-0 property, directly
in terms of VOAs. Assume the uniqueness conjecture: V♮ is the only c = 24 VOA
with graded dimension J (Section 7.2.1). He argues from this that, for each g ∈ M, the
McKay–Thompson series T_g will be a Hauptmodul iff the only orbifolds of V♮ are the
Leech lattice VOA V(Λ) and V♮ itself. More precisely, orbifolding V♮ by ⟨g⟩ should be
V♮ if g is Fricke, and V(Λ) if g is non-Fricke (‘Fricke’ is defined in Section 7.3.2).

In, for example, [313], this analysis is extended to the genus-0 property of some Norton
series N_(g,h), when the subgroup ⟨g, h⟩ is not cyclic (thus going beyond [150]), although
again assuming the uniqueness conjecture. Tuite is thus suggesting that the genus-0
property of the Monstrous Moonshine functions T_g and N_(g,h) seems to be equivalent
to a single principle. These arguments emphasise the importance of establishing the
uniqueness conjecture of V♮. Unfortunately, that still seems out of reach.


7.3.5 Modular Moonshine 


Consider an element g ∈ M. We know from [466], [450], [150] that there is a Moonshine
for the centraliser C_M(g) of g in M, governed by the g-twisted module V♮(g).
Unfortunately, V♮(g) is not usually itself a VOA, so the analogy with M is not perfect.
Ryba found it interesting that, for g ∈ M of prime order p, Norton’s series N_(g,h) can be
transformed into a McKay–Thompson series (and has all the associated nice properties)
whenever h is p-regular (i.e. h has order coprime to p); as we know, in this case ⟨g, h⟩ is
cyclic. This special behaviour of p-regular elements suggested to him to look at modular
representations, for reasons we’ll soon see.

Let’s begin by reviewing the basics of modular representations and Brauer characters
(see also [446], [308]). A modular representation ρ of a group G is a representation
defined over a field of positive characteristic p dividing the order ‖G‖ of G. This is
precisely the class of finite-dimensional representations where the usual properties break
down. Such representations possess many special (that is to say, unpleasant) features.

For one thing, they are no longer completely reducible, so Theorem 1.1.2 breaks down.
For a simple example, let p be any prime and consider G = Z_p; then over any field of
characteristic p, the map

\[ a \mapsto \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} \qquad (7.3.6) \]

defines a two-dimensional representation of G that is indecomposable but not irreducible.
It’s not irreducible because it maps the x-axis to itself, and so contains the
one-dimensional identity representation as a subrepresentation. Before, given a
representation we could simplify it enough merely by writing it as a direct sum of
indecomposables, but here there are far too many indecomposables. In other words, there are
other more complicated ways to combine irreducibles than direct sum. The familiar role
of irreducibles as direct summands is replaced here by their role as composition factors.
It is completely analogous to, and simpler than, the role of simple groups in finite group
theory (recall Section 1.1.2). Completely reducible representations (as in Theorem 1.1.2)
are equivalent to a representation with blocks down the diagonal and zero-blocks above
and below the diagonal; the diagonal blocks are its irreducible summands. On the other
hand, a modular representation ρ is equivalent to a matrix with zero-blocks below the
diagonal; the blocks along the diagonal (e.g. two copies of the trivial representation (1)
for the representation in (7.3.6)) are the composition factors, and the blocks above the
diagonal describe how these glue together.
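
The Z_p example can be checked by brute force for small primes. The following is a minimal sketch, with hypothetical helper names (`rho`, `matmul`, `invariant_lines`), assuming the representation sends a to the upper triangular matrix with rows (1, a) and (0, 1) over the prime field F_p. It confirms that the map is a homomorphism and that the x-axis is the only invariant line, so the representation is not irreducible, and it cannot split as a direct sum of two one-dimensional subrepresentations.

```python
def rho(a, p):
    """Matrix assigned to a in Z_p, with entries reduced mod p."""
    return ((1, a % p), (0, 1))


def matmul(A, B, p):
    """Product of two 2 x 2 matrices over F_p."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % p for j in range(2))
        for i in range(2)
    )


def invariant_lines(p):
    """Lines in F_p^2 preserved by rho(1), each given by a spanning vector.

    Since rho(1) generates the image of Z_p, these are exactly the
    one-dimensional subrepresentations.
    """
    # One representative per line: (1, 0) and (t, 1) for t in F_p.
    reps = [(1, 0)] + [(t, 1) for t in range(p)]
    g = rho(1, p)
    fixed = []
    for x, y in reps:
        gx = (g[0][0] * x + g[0][1] * y) % p
        gy = (g[1][0] * x + g[1][1] * y) % p
        # The image lies on the same line iff this determinant vanishes mod p.
        if (x * gy - y * gx) % p == 0:
            fixed.append((x, y))
    return fixed
```

For every prime tried, `invariant_lines(p)` returns only the x-axis `(1, 0)`: there is a subrepresentation, but no complementary one.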

Another complication is that the familiar character χ_ρ of (1.1.5) loses its usefulness.
As we saw at the end of Section 1.1.3, very different modular representations can have
identical characters. Instead, the more subtle Brauer character β(ρ) is used. It can be
defined as follows. Let m be the order ‖G‖ of G, and write m = p^a p′ where p and
p′ are coprime. Let K be the cyclotomic field Q[ξ_m], and let R = Z[ξ_m] be the ring
of cyclotomic integers. A finite field k of characteristic p can be obtained from R
by choosing any prime ideal 𝔭 of R containing pR; then k = R/𝔭. This construction
of k defines a ring homomorphism φ_𝔭 : R → k. In particular, put ξ := ξ_{p′} ∈ R; then
ξ̄ = φ_𝔭(ξ) will be a primitive p′th root of unity in k.

Suppose ρ is some n-dimensional modular representation of G over k. Let G_{p′} be the
set of all p-regular elements in G. The field k defined above is big enough that the n × n
matrix ρ(g), for any g ∈ G_{p′}, is diagonalisable over k. More precisely, its n eigenvalues
(counting multiplicities) are all p′th roots of unity in k, and so can be written as ξ̄^{a_i} for
some integers a_i, 1 ≤ i ≤ n.

The Brauer character β(ρ) of ρ is defined to be

\[ \beta(\rho)(g) = \sum_{i=1}^{n} \xi^{a_i} \in R \subset \mathbb{C}, \qquad \forall g \in G_{p'} . \]

It is a well-defined class-function on G_{p′}, and in fact the Brauer characters of the
irreducible modular representations form a basis for the space of class functions on G_{p′}.
Two representations have the same Brauer character iff they have the same composition
factors. Brauer characters were introduced by Brauer and his student Nesbitt in 1937.
Apart from their role in modular representations, they also relate p-subgroups of G with
properties of the usual character table. See Question 7.3.4 for an example.


Theorem 7.3.2 ([484], [79], [77]) Let g ∈ M be any element of prime order p, for any
p dividing ‖M‖. Then there is a vertex operator superalgebra $^gV = \oplus_{n\in\mathbb{Z}}\,{}^gV_n$ defined
over the finite field F_p and carrying a (projective) representation of the centraliser C_M(g).
If h ∈ C_M(g) is p-regular, then the graded Brauer character

\[ R(g,h;\tau) = q^{-1} \sum_{n\in\mathbb{Z}} \beta({}^gV_n)(h)\, q^n \]

equals the McKay–Thompson series T_{gh}(τ). Moreover, for g belonging to any conjugacy
class in M except 2B, 3B, 5B, 7B or 13B, this is in fact an ordinary VOA (i.e. the ‘odd’
part vanishes), while in those remaining cases the graded Brauer characters of both the
odd and even parts can be expressed separately using McKay–Thompson series.


We defined vertex operator superalgebras in Section 5.1.3. The centralisers C_M(g) in
the theorem are quite nice: for example, for g of type 2A, 2B, 3A, 3B, 3C, 5A,
5B, 7A, 11A these are extensions of the sporadic groups B, Co₁, Fi′₂₄, Suz, Th, HN,
HJ, He and M₁₂, respectively. The proof for p = 2 is not complete as it relies on a
still-unproven hypothesis. The conjectures in [484] concerning modular analogues of
the Griess algebra for several sporadic groups follow from Theorem 7.3.2.

Can these modular $^gV$’s be interpreted as a reduction mod p of (super)algebras in
characteristic 0? What can we say about elements g of composite order in M?


Conjecture 7.3.3 (Borcherds [77]) Choose any g ∈ M and let n denote its order.
Then there is a (1/n)Z-graded superspace $^g\hat V = \oplus_{i\in\frac{1}{n}\mathbb{Z}}\,{}^g\hat V_i$ over the ring of cyclotomic
integers Z[e^{2πi/n}]. It is often (but probably not always) a vertex operator superalgebra;
in particular, $^1\hat V$ is an integral form of the Moonshine module V♮. Each $^g\hat V$
carries a representation of a central extension of C_M(g) by Z_n. Define the graded
trace

\[ B(g,h;\tau) = q^{-1} \sum_{i\in\frac{1}{n}\mathbb{Z}} \mathrm{Tr}\bigl(h \mid {}^g\hat V_i\bigr)\, q^i . \]

If g, h ∈ M commute and have coprime orders, then B(g, h; τ) = T_{gh}(τ). If all
q-coefficients of T_g are nonnegative, then the ‘odd’ part of $^g\hat V$ vanishes, so it is an ordinary
space, and should equal the g-twisted module V♮(g) of [150]. If g has prime order
p, then the reduction mod p of $^g\hat V$ is the modular vertex operator superalgebra $^gV$ of
Theorem 7.3.2.


More precisely, $^g\hat V$ is to be a free module over the ring Z[e^{2πi/n}], and each graded piece
is finite-dimensional over that ring. When we say $^1\hat V$ is an integral form for V♮, we
mean that $^1\hat V$ has the same structure as a VOA, with everything defined over Z, and
tensoring it with C gives V♮. Borcherds’ conjecture, which beautifully tries to explain
Theorem 7.3.2, is completely open. It provides the analogue for V♮ of the surprising Lie
algebra Theorems 1.5.4 and 3.4.1.


7.3.6 McKay on Dynkin diagrams


McKay found other relationships with Lie theory [411], [75], [247], reminiscent of his
A-D-E correspondence with finite subgroups of SU₂(C) (see Section 2.5.2). As we
see from Table 7.2, M has two conjugacy classes of involutions. Let K be the smaller
one, called ‘2A’ in [109] (the alternative, class ‘2B’, has almost 100 million times more
elements). The product of any two elements of K will lie in one of nine conjugacy classes:
namely, 1A, 2A, 2B, 3A, 3C, 4A, 4B, 5A, 6A. These conjugacy classes are of elements of
orders 1, 2, 2, 3, 3, 4, 4, 5, 6. It is remarkable that, for such a complicated group as M, that
list stops at only 6; as we know from Section 7.3.3, we call M a 6-transposition group
for this reason. The punchline: McKay noticed that those nine numbers are precisely the
labels a_i of the affine E₈ diagram (see Figure 3.2). Thus we can attach a conjugacy class
of M to each vertex of the E₈⁽¹⁾ diagram. A direct interpretation of the edges in the E₈⁽¹⁾
diagram, in terms of M, is unfortunately not yet known, though [247], [365] establish
how to unambiguously assign classes to the nodes.
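
The numerical coincidence is easy to verify by machine. The sketch below is an illustration only: it assumes the standard shape of the affine E₈ diagram (a chain of eight nodes with one extra node attached to the sixth) and uses the fact that the labels of a simply-laced affine diagram are characterised, up to scaling, by 2a_i being the sum of the labels on the neighbouring nodes. The node numbering and the names `EDGES`, `LABELS`, `is_null_vector` are arbitrary choices made here.

```python
# Affine E8: chain 0-1-2-3-4-5-6-7 with node 8 attached to node 5.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (5, 8)]
LABELS = [1, 2, 3, 4, 5, 6, 4, 2, 3]


def is_null_vector(labels, edges):
    """Check that 2 * a_i equals the sum of a_j over the neighbours j of i."""
    neighbour_sum = [0] * len(labels)
    for i, j in edges:
        neighbour_sum[i] += labels[j]
        neighbour_sum[j] += labels[i]
    return all(2 * a == s for a, s in zip(labels, neighbour_sum))


# Element orders read off from the names of the nine classes in the text.
CLASSES = ["1A", "2A", "2B", "3A", "3C", "4A", "4B", "5A", "6A"]
orders = sorted(int(name[:-1]) for name in CLASSES)
```

Here `is_null_vector(LABELS, EDGES)` holds and `orders` agrees with `sorted(LABELS)`, which is McKay's observation as a multiset identity; the code says nothing about which class belongs on which node.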

We can’t get the affine E₇ labels in a similar way, but McKay noticed that an order 2
folding of affine E₇ gives the affine F₄ diagram, and we can obtain its labels using the
Baby Monster B (the second largest sporadic). In particular, let K now be the smallest
conjugacy class of involutions in B (also labelled ‘2A’ in [109]); the conjugacy classes
in KK have orders 1, 2, 2, 3, 4 (B is a 4-transposition group), and these are the labels of
F₄⁽¹⁾. Of course we’d prefer E₇⁽¹⁾ to F₄⁽¹⁾, but perhaps that two-folding has something
to do with the fact that an order-2 central extension of B is the centraliser of an element
g ∈ M of order 2.

Now, the triple-folding of affine E₆ is affine G₂. The Monster has three conjugacy
classes of order 3. The smallest of these (‘3A’) has a centraliser that is a triple cover of
the Fischer group Fi′₂₄.2. Taking the smallest conjugacy class of involutions in Fi′₂₄.2,
and multiplying it by itself, gives conjugacy classes with orders 1, 2, 3 (Fi′₂₄.2 is a
3-transposition group), and those not surprisingly are the labels of G₂⁽¹⁾!

McKay’s E₈⁽¹⁾, F₄⁽¹⁾, G₂⁽¹⁾ observations still have no explanation. In [247] these patterns
are extended, by relating various simple groups to the E₈⁽¹⁾ diagram with deleted
nodes. More recently, [365] relate the E₈⁽¹⁾ observation to VOAs, by applying [425] to
the lattice VOA V(√2 E₈); the connection with V♮ is plausible but not yet completely
established. As we know from Section 1.5.4, the folding of Coxeter–Dynkin diagrams
arises when we restrict to the invariant subalgebras of automorphisms, so perhaps that
provides a clue how to attack the F₄⁽¹⁾ and G₂⁽¹⁾ observations.


7.3.7 Hirzebruch’s prize question 


Algebra is the mathematics of structure, and so of course it has a profound relationship 
with every area of mathematics. Therefore the trick for finding possible fingerprints of 
Moonshine in, say, geometry is to look there for modular functions. And that search 
quickly leads to the elliptic genus. 

We briefly discuss this in Section 5.4.2, where we mention several deep relationships
between elliptic genera and the material covered elsewhere in this book. Let us simply
mention here that the genus of a manifold will typically involve negative coefficients
and be the graded dimension of a vertex operator superalgebra. This certainly doesn’t
preclude Moonshine-like behaviour: for example, Moonshine for Co₁ involves, as we
know, a vertex operator superalgebra. However, the genera of even-dimensional
projective spaces have nonnegative integer coefficients [400]; it would be interesting to
study the representation-theoretic questions associated with them.

Hirzebruch’s ‘prize question’ (page 86 of [287]) asks for the construction of a
24-dimensional manifold M with Witten genus or Â-genus J (after being normalised by η²⁴). We
would like the Monster to act on M by diffeomorphisms, and the twisted Witten genera to be the
McKay–Thompson series T_g. See also [151]. It would also be nice to associate Norton’s
series N_(g,h) with this Moonshine manifold. Constructing such a manifold would realise
the geometry underlying Monstrous Moonshine, and as such is perhaps the remaining
Holy Grail in the subject.

Hirzebruch’s question was partially answered by Mahowald–Hopkins [399], who
constructed a manifold with Witten genus J, but couldn’t show it would support an
effective action of M. Related work is [21], who constructed several actions of M on, for
example, 24-dimensional manifolds (but none of which could have genus J), and [364],
who showed the graded dimensions of the subspaces V♮_n of the Moonshine module are
twisted Â-genera of the Milnor–Kervaire manifold (the Â-genus is the specialisation
of elliptic genus to the cusp i∞).

Related to elliptic genus is elliptic cohomology, which is described beautifully in
[499]. Mason’s constructions [407] associated with Moonshine for the Mathieu group
M₂₄ have been interpreted as providing a geometric model (‘elliptic system’) for elliptic
cohomology Ell*(BM₂₄) of the classifying space of M₂₄ [523], [154].


7.3.8 Mirror Moonshine 


There has been a second conjectured relationship between geometry and Monstrous
Moonshine. Calabi–Yau manifolds (see e.g. [299]) are a class of complex manifolds
with an unusually rich mathematical structure; for example, in dimensions 1 and 2 they
are elliptic curves and K3 surfaces, respectively. Specifying a Calabi–Yau manifold X
means choosing a complex structure, as well as a Kähler class [ω] ∈ H²(X, C). In the
case of an elliptic curve (i.e. a torus), this corresponds to choosing parameters τ, σ ∈ H.
Mirror symmetry [291] says that most Calabi–Yau manifolds come in closely related
pairs, where the roles of the complex structure and Kähler structure are switched. In
the case of elliptic curves, it relates the pair (τ, σ) to the pair (σ, τ) and implies the
modularity of certain generating functions for Gromov–Witten invariants (see [132] for
a review). This unexpected modularity is, of course, reminiscent of Moonshine, and it is
tempting to look for a concrete connection.

Consider a one-parameter family X_λ of Calabi–Yau manifolds, with mirror X* given
by the resolution of an orbifold X/G for G finite and abelian. Then the Hodge numbers
h^{1,1}(X) and h^{2,1}(X*) will be equal, and more precisely the moduli space of (complexified)
Kähler structures on X will be locally isometric to the moduli space of complex structures
on X*. The ‘mirror map’ λ(q), which can be defined using the Picard–Fuchs equation
[438], is a canonical map between those moduli spaces. For example,
x₁⁴ + x₂⁴ + x₃⁴ + x₄⁴ + λ^{−1/4} x₁x₂x₃x₄ = 0 is such a family of K3 surfaces, where G = Z₄ × Z₄. Its mirror
map is given by

\[ \lambda(q) = q - 104q^2 + 6444q^3 - 311\,744q^4 + 13\,018\,830q^5 - 493\,025\,760q^6 + \cdots . \qquad (7.3.7) \]


Lian–Yau [385] noticed that the reciprocal 1/λ(q) of the mirror map in (7.3.7) equals
the McKay–Thompson series T_2A(τ) + 104. After looking at several other examples with
similar conclusions, they proposed their Mirror Moonshine Conjecture: The reciprocal
1/λ of the mirror map of a one-parameter family of K3 surfaces with an orbifold mirror
will be a McKay–Thompson series (up to an additive constant).
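
The Lian–Yau observation can be tested on the coefficients printed in (7.3.7) by inverting the power series with exact integer arithmetic. This is a small sketch rather than anything from [385]: the function name `reciprocal_coefficients` and the list `MIRROR_MAP` are hypothetical, and the comparison values 4372, 96256, 1240002, 10698752 are the first coefficients of T_2A as usually tabulated, which is an assumption not stated in the text above.

```python
def reciprocal_coefficients(a):
    """Invert a power series with leading term q.

    a = [a_1, a_2, ...] with a_1 = 1 encodes sum_k a_k q^k.
    Returns [b_0, b_1, ...] with 1/series = q^(-1) * sum_k b_k q^k,
    truncated to the same number of terms.
    """
    assert a[0] == 1
    b = [1]
    for k in range(1, len(a)):
        # The q^k coefficient of (sum_j a_{j+1} q^j) * (sum_j b_j q^j)
        # must vanish for k >= 1; solve for b_k.
        b.append(-sum(a[j] * b[k - j] for j in range(1, k + 1)))
    return b


# Coefficients of q, q^2, ..., q^6 in the mirror map (7.3.7).
MIRROR_MAP = [1, -104, 6444, -311744, 13018830, -493025760]
```

Calling `reciprocal_coefficients(MIRROR_MAP)` gives `[1, 104, 4372, 96256, 1240002, 10698752]`, that is, 1/λ(q) = q⁻¹ + 104 + 4372q + 96256q² + 1240002q³ + 10698752q⁴ + ···, in agreement with T_2A(τ) + 104 to this order.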

A counterexample (and more examples) are given in section 7 of [544]. In particular,
although there are relations between mirror symmetry and modular functions (see e.g.
[266] and [275]), there doesn’t seem to be any special relation with M. Doran [158]
‘demystifies the Mirror Moonshine phenomenon’ by finding necessary and sufficient
conditions for 1/λ to be a modular function for a modular group commensurable with
SL₂(Z).

This focus on K3 surfaces is not significant. Calabi–Yau 3-folds are the real meat of
mirror symmetry, but it is much harder to find explicit families. Some of the interesting
number theory of Calabi–Yau manifolds and mirror symmetry is reviewed in [571].


7.3.9 Physics and Moonshine 


The physical side of Moonshine (namely, perturbative string theory and conformal field
theory) was noticed early on, and has profoundly influenced the development of Moonshine
and VOAs. This effectiveness of physical interpretations isn’t magic: it merely tells
us that finite-dimensional objects are sometimes seen much more clearly when studied
through infinite-dimensional structures (often by being ‘looped’). Of course Monstrous
Moonshine, which teaches us to study the finite group M via its infinite-dimensional
module V♮, fits perfectly into this picture.

Throughout this book we’ve described various points-of-contact between mathematics
and physics. Because V♮ is so mathematically special, it may be expected that it
corresponds somehow to interesting physics. Although there have been some attempts
to directly interpret Monstrous Moonshine in the context of physics, we still have no
evidence Nature concurs.

There is a c = 24 RCFT whose anti-holomorphic chiral algebra is trivial, and whose
holomorphic one, as well as the state space H, are both V♮ (this is possible because
V♮ is holomorphic). This RCFT is nicely described in [142]; its symmetry is the
Monster. The Bimonster M ≀ Z₂ = (M × M) ⋊ Z₂ (Section 7.1.1) is the symmetry of
a c = c̄ = 24 RCFT with state space H = V♮ ⊗ V̄♮. The paper [119] finds the D-branes
(boundary states) of lowest mass for this theory; they are in one-to-one correspondence
g ↦ ||g⟩⟩ with the elements of M. The Bimonster permutes them: (h, k).||g⟩⟩ = ||hgk⁻¹⟩⟩,
while the remaining involution sends ||g⟩⟩ to ||g⁻¹⟩⟩. Most interestingly, their ‘overlaps’
⟨⟨g|| q^{(L₀+L̄₀−2)/2} ||h⟩⟩ equal the McKay–Thompson series T_{g⁻¹h}. We largely ignored
D-branes (surfaces on which endpoints of open strings rest) in Chapter 4, but they are a
natural ingredient in string theory. Much as every natural property of the Wess–Zumino–Witten
string translates nicely into Lie theory, it would appear that the same holds with
the string theory H = V♮ ⊗ V̄♮ and the Monster M. Surely it would be interesting to
continue that investigation. Other suggestions for the physics of Monstrous Moonshine
are [99], [274], [96], [260], [281].


Question 7.3.1. Let f(τ) be a Hauptmodul for some genus-0 group Γ. For any a > 0,
prove that f(aτ) is fixed by a genus-0 group (call it Γ_a), and any modular function for
Γ_a will be a rational function in f(aτ).


Question 7.3.2. Let G be any group with exponent k < 6 (i.e. g^k = e for all g ∈ G).
Suppose there are a set of functions N_(g,h)(τ) associated with every commuting pair
g, h ∈ G, with the property that equation (7.3.3) always holds with α = 1. Prove that
each of these functions is fixed by a genus-0 subgroup of SL₂(Z).


Question 7.3.3. Assume for simplicity that g ∈ M is such that C_M(g) acts linearly (i.e.
nonprojectively) on the twisted module V♮(g). Then for h ∈ C_M(g) of order n, the
q-coefficients of Z(g, h) all lie in the field Q[ξ_n]. Fix any Galois automorphism σ ∈
Gal(Q[ξ_n]/Q), and let σZ(g, h) denote the q-expansion obtained by formally applying σ
term-by-term to Z(g, h): σ(Σ_i a_i q^i) = Σ_i σ(a_i) q^i. Show that σZ(g, h) equals another
series Z(g′, h′), for some g′ ∈ M, h′ ∈ C_M(g′).


Question 7.3.4. Consider the usual representation ρ of G = S₃ by 3 × 3 permutation
matrices, associating with π ∈ S₃ the matrix ρ(π) obtained from the identity matrix by
applying π to the components of each column. For example,

\[ \rho(123) = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} . \]

Show that ρ is completely reducible when considered as a modular representation over
characteristic 2, but is not completely reducible when considered as a modular
representation over characteristic 3. For both characteristic 2 and 3, compute its Brauer character
using the definition given in Section 7.3.5.


Epilogue, or the squirrel who got away? 


So, has Monstrous Moonshine been explained? According to most of the fathers of the 
subject, it hasn’t. They consider VOAs in general, and V♮ in particular, to be too
complicated to be God-given. The progress, though impressive, has broadened not lessened
the fundamental mystery, they would argue. 

For what it’s worth, I don’t completely agree. Explaining away a mystery is a little 
like grasping a bar of soap in a bathtub, or quenching a child’s curiosity. Only extreme 
measures like pulling the plug, or growing up, ever really work. True progress means 
displacing the mystery, usually from the particular to the general. Why is the sky blue? 
Because of how light scatters in gases. Why are Hauptmoduls attached to each g ∈ M?
Because of V♮. Mystery exists wherever we can ask ‘why’; like beauty, it’s in the
beholder’s eye.

Understanding doesn’t put an end to questions, it spices them. There’s always a hori- 
zon, no matter how high you climb, beyond which everything is still hidden. 

However, have we really isolated the key conjunction of properties needed for Moon- 
shine to arise? Can we derive the Monster from Monstrous Moonshine? In Section 7.2.4 
we make the case for a more direct, topological explanation for Moonshine involving 
compatible actions of the Virasoro algebra and the braid group B₃. In any case, we need
a second independent proof of Monstrous Moonshine. 

Moonshine is now ‘leaving the nest’. We are entering a consolidation phase, tidying 
up, generalising, simplifying, clarifying, working out more examples, climbing a few 
metres higher. Important and interesting discoveries will be made in the next few years, 
and yes, there still is mystery, but no longer does a Moonshiner feel like an illicit distiller: 
Moonshine is now a day-job! 


Question and Answer in the Mountains¹


They ask me why I live in the green mountains.
I smile and don’t reply; my heart’s at ease.
Peach blossoms flow downstream, leaving no trace,
And there are other earths and skies than these.
Li Bai 701 AD


¹ Translated by Vikram Seth, Three Chinese Poets (London, Faber and Faber, 1992).
Notation


Symbol            Meaning

Re z, Im z        real, imaginary parts of z ∈ C
⌊x⌋, ⌈x⌉          largest (smallest) integer ≤ x (≥ x)
⟨u, v⟩            Hermitian form
(u|v) = u·v       inner-product
Mᵗ                transpose of matrix M
M†                matrix-adjoint = M̄ᵗ
‖S‖               cardinality of a set, order of a group
z̄                 complex conjugation
gcd(a, b)         greatest common divisor
B_n               the braid group on n strands (1.1.9)
the complex numbers 

the multiplicative group of nonzero z € C 
conformal field theory 

Kronecker delta: 1 if i = j, otherwise 0 
Dirac delta distribution 

Dedekind eta function (2.2.6b) 

the upper half-plane (0.1.1) 

the upper half-plane with cusps (0.1.3) 
n x n identity matrix 

Hauptmoduls for SL2(Z) (0.1.8) 
Leech lattice (Section 1.2.1) 
Monster finite simple group 

the nonnegative integers {0, 1,2, ...} 
Riemann sphere C U {oo} 

ee reH 

the rational numbers 

the real numbers 

rational conformal field theory 
Riemann surface, complex curve 

n x n det= 1 matrices, entries in R 
point in H 

Jacobi theta function (2.2.6a) 
McKay-Thompson series (0.3.3) 
V♮                Moonshine module (Section 7.2.1)
VOA               vertex operator algebra (Definition 5.1.3)
ξ_n               root of unity exp(2πi/n)
Z                 the integers

Section 1.1
e                 the identity in a group
G ≅ H             groups G and H are isomorphic
Z_n               ring and additive group Z/nZ
F_q               finite field with q elements
N ⊲ G             N normal subgroup of G
H < G             H subgroup of G
F_n               the free group on n generators
⟨g₁, ..., g_n⟩     the group generated by elements g_i
D_n               dihedral group (1.1.1)
N × H             direct product
N ⋊ H             semi-direct product
S_n               symmetric group
Z(G)              centre of G
A_n               alternating group
M₁₁, ..., M₂₄      Mathieu sporadics
GL_n(K)           invertible matrices over field K
K_g               conjugacy class {hgh⁻¹}
CG                group algebra
P_n               pure braid group
C[w, w⁻¹] = C[w^{±1}]   Laurent polynomials
[G, G]            commutator subgroup ⟨ghg⁻¹h⁻¹⟩

Section 1.2
L                 a lattice (Section 1.2.1)
L*                the dual of a lattice (Section 1.2.1)
II_{m,n}          indefinite even self-dual lattice
|L|               determinant of lattice
L₁ ⊕ L₂           orthogonal direct sum of lattices
Sⁿ                the n-sphere
C^∞               smooth; all partials are continuous
C^∞(U)            C^∞-functions f : U → R
T_p(M)            tangent space at p ∈ M
TM                tangent bundle
Vect(M)           vector fields

d (M) differential 1-forms 

T*M cotangent bundle 


m\(M, v) = 7(M) fundamental group 
R

M,(C) 

type In, I, Too, Ma 
g

[xy] 

Heis 

SO; R) 

G 

gl, 

[gb] 

Z(g) 

ad x 

K(x|y) 

g(A) 

A,, B,, C,, D, 
E¢, Ey, Eg, F4, Go 
Mitt 
Diff(M), Difft(M) 
Section 1.5 
LA) 

P+(g) 

M(A) 

b 

ae® 

Dox 

(a|B) 


Fa 
projective n-space
configuration space (1.2.6) 


complex separable Hilbert space 

Hilbert space of square-summable sequences 
smooth functions with compact support 
Schwartz space 

Hilbert space of square-integrable functions 
Lebesgue measure 

adjoint of operator T 

direct integral of Hilbert spaces 

bounded operators on H 

commutant of set S 

n x n matrices over C 

families of factors 

crossed-product (1.3.4) 


a Lie algebra 

bracket (multiplication) in Lie algebra 
Heisenberg algebra (1.4.3) 

Lorentz group 

universal cover of G 

Lie algebra of n x n matrices 

span [xy], x Eg, y €H 

centre of g 

adjoint operator (ad x)(y) = [xy] 
Killing form on g 

Lie algebra associated with Cartan matrix A 


Lie algebras sl, )(C), 502-41 (C), 5p2,.(C), 502- (0) 


exceptional simple Lie algebras 
Witt algebra (1.4.9) 
(orientation-preserving) diffeomorphism group 


irreducible module with highest weight à 
dominant integral weights 

Verma module with highest weight À 

a Cartan subalgebra 

roots 

root-space (1.5.5b) 

Killing form on h* 

Weyl reflection (1.5.5c) 

Weyl group 


di; EA 
Wj 


B, u E€ Qp) 


z,h 

y € Aut(g) 
B(H) 

G 
Vect

Riem 
Braid 
AUvVWw 

Cuv 
Ribbon 
Ribbons 
K, L

Kii, ..., an] 
[L : K] 
Gal(L/K) 
QIé,] 
K(S)

p(z) 

O, s(T, Z) 
H*(S), MECS) 




simple roots in a base 

fundamental weights 

weights of representation p 
weight-spaces (1.5.6b) in module V 
universal enveloping algebra 

character (1.5.9a) of module V 
character of L(A) 

representation; also, the Weyl vector ye Wj 
y -twisted character 

elements in 

automorphisms of g 

bounded operators with bounded inverse 
unitary dual of Lie group G 


set of arrows = morphisms 
category of vector spaces 

category of Riemann surfaces 
category of braids 

associativity constraint 
commutativity constraint 

category of ribbons 

category of ribbons labelled from S 


fields 

field of polynomials in (algebraic) a; 
degree of field extension K C L 
Galois group 

cyclotomic field 


field of meromorphic functions 

Weierstrass function (2.1.6a) 

theta functions (2.1.7a) 
holomorphic/meromorphic k-differentials 
moduli space for genus g, n punctures 
mapping class group for genus g, n punctures 
Jacobian variety 

Deligne—Mumford compactification 
enhanced moduli space 

enhanced mapping class group 


Riemann zeta function (2.2.3c) 
the principal congruence subgroup (2.2.4a) 
k

ô 

Pt 

Xr 


a congruence subgroup (2.2.4b) 
(2.2.5) 
lattice theta function (2.2.1 1a), (2.3.7) 


Gamma function 
Jacobi theta functions 


p-adic integers 
projective limit 
metaplectic group (2.4.9) 


the An, Dn, E6, E7, Eg meta-pattern 
condition on graphs 


Witt algebra (1.4.9) 
standard basis for Witt 
Virasoro algebra (3.1.5) 
standard basis of Yir 


Hamilton operator, gives grading on Yir-modules 


central charge, conformal weight 
Verma module 

irreducible module 

c, h for discrete series (3.1.6) 
the character of V (c, h) 
diffeomorphism group of S! 


finite-dimensional semi-simple Lie algebra 
polynomial loop algebra S! > g 
nontwisted affine algebra 

labels, co-labels (Figure 3.2) 

Cartan subalgebras of g and g 

twisted affine algebra 

Verma module with highest weight À 
irreducible module with highest weight à 
level 

imaginary root 

integrable level k highest weights (3.2.8) 
character of L(A) 

central charge, conformal weight (3.2.9) 
weights 


h“ 
KZ 
LG 
Gr

gy,P 
Section 3.4 
xr 
Section 4.1


Ya, t) 


|in), |out) 
(outļin) 

ox), P(X), WO) 
pt 

QED 


Section 4.3 
OPE 
gen) 

y 

T(z) 
WZW 

M e (V) 




dual Coxeter number 


Knizhnik—Zamolodchikov (Section 3.2.4) 


loop group (Section 3.2.6) 


g, b extended by derivations 


toroidal algebra associated with 2-cocycle t 


Krichever—Novikov algebra 


a-twisted character 


Monster Lie algebra (Sections 3.3.2, 7.2.2) 


Lagrangian 
Hamiltonian 
momentum components 
speed of light 
Lagrangian density 


wave-function 

Planck’s constant 

state-space (Hilbert space) 
Fock space 

annihilation, creation operators 
Hamiltonian operator 
vacuum 

state 

incoming, outgoing states 
transition amplitude 
quantum fields 
energy—momentum operators 
quantum electrodynamics 


operator product expansion 
space of chiral blocks 

chiral algebra (VOA) 
stress—energy tensor 
Wess—Zumino—Witten model 
irreducible V-module 

graded dimension (4.3.8a) 
1-loop partition function (4.3.8b) 
h-twisted sector 
ype
Z(g,n(T) 
Xho (T) 
Section 4.4 
Cin 

Cek 

Vect; 


Section 5.1 
W [z+!] 
w[iz*] 


xXv(T) 
(x|*) 
Section 5.2 
ou) 

PVn 
Aut(V) 
V(C”) 


Section 5.3 
M 

Yuu, z) 
Ma 

Me 

(V) 

M [xX] N 
NN 
h(M) 
Maxn 
A(V) 
Xu(T, v) 
x4 (T, u, v) 




fixed-point subalgebra for group G 
(4.3.14b) 
(4.3.15c) 


disjoint union of m circles 
connected component in Segal’s Hom(Cm, Cn) 
category of finite-dimensional spaces 


Laurent polynomials in z, coefficients in W 
formal power series in z 
multiplicative Dirac delta (5.1.2) 
residue (5.1.3a) 

a VOA 

vertex operator 

space of conformal weight n vectors 
mode of u ∈ V

conformal vector 

vacuum vector in V 

graded dimension (5.1.10b) 
invariant bilinear form (5.1.11) 


the ‘zero-mode’ u_{(n-1)}, for u ∈ V_n
conformal primaries (5.2.3) 
automorphism group of V 
Heisenberg VOA 

affine algebra VOA 

lattice VOA 

G-fixed points in V 


V-module

vertex operator for M 
conformal weight œ space 
contragredient (= dual) module 
irreducible V-modules 
fusion 

fusion multiplicities (5.3.3) 
conformal weight of M 

the algebra of n × n matrices
Zhu’s algebra 

character of M (5.3.13) 
Jacobi character (5.3.14) 


L{n] 

Vin] 
Inn(V) 
Z(M, h;t) 


Section 5.4 
X 
MSVyx 
MSVyx(X) 
T(X) 
Section 6.1 
ab 
S = (Sap) 
(g,m+n) by,..., bn 


Gidh 


Section 6.2 
Gr(m, n) 
L(x) 
RVOA 
U9) 
R(q) 
Ca(g) 
NCM 
[M:N] 
MXN 


Section 6.3 


Section 7.1 


Fij 
M? Z2 
I, 
o(g) 




a second Virasoro action on V
homogeneous spaces for L[0] 
inner-automorphisms of V 
h-twisted graded trace (5.3.23) 


smooth complex variety 
chiral de Rham complex 
global sections 
Tamanoi’s invariant 


fusion multiplicities 

modular matrix 

Verlinde dimensions (6.1.2) 
diagonal matrix in modular data 
modular invariant 

intertwining operator 


space of intertwining operators of type Ce) 


Grassmannian 

Rogers’ dilogarithm

rational vertex operator algebra 
quantum group 

family of solutions to (6.2.8) 
centraliser of g in G

subfactor 

Jones index 

M — N bimodule 


Fermat curve x^n + y^n + z^n = 0
Jacobian of Fermat curve 
absolute Galois group of Q 
algebraic closure of Q 
cyclotomic character 

profinite completion of G 


Baby Monster 

a Fischer group 
Bimonster 

fixing group of T, 
order of g 
Γ_0(p)+ (7.1.5)

k | N^∞ any prime dividing k divides N
Section 7.3 

Af! vertex operator superalgebra for Co_1
VB Baby Monster Moonshine module 
N_{(g,h)}(τ) Norton series

B(g) Brauer character 

B(g,h;T) Modular Moonshine series 
harmonic oscillator 228, 230-233, 238, 246,
248-249, 255-257, 276 
Hauptmodul 
classification 139-140, 408 
definition 5, 139 
examples 6, 139 
and modular equations 418 
and replicable functions 411 
(see also j-function, McKay—Thompson series; 
replicable functions) 
heat equation 147 
heat kernel 148—149 
and KZ equation 11, 150 
and modularity 11, 104, 148-150 
Hecke operator 222, 410 
Heisenberg algebra Heis 54, 179-180, 194-195,
212, 214, 216, 265, 326 
Heisenberg group 85-86, 104, 155, 159-164, 179 
Heisenberg VOA (see vertex operator algebra, 
Heisenberg) 
Hermitian form 22, 25, 45, 67, 282, 300 
hexagon axiom 90-91, 398-399 
highest weight (see weight) 
Hilbert space 
definition 45 
in quantum field theory 177, 253, 272, 275 
in quantum mechanics 177, 241—242, 248, 249 
separable 46 
Hilbert’s Problems 56 
Hirzebruch’s ‘prize question’ 431—432 
holomorphic function 200, 266, 274, 281-282 
homeomorphism 33, 41, 100, 110, 306 
homogeneous coordinates (projective space) 39, 
108, 112 
homogeneous extension (see direct product) 
homogeneous space G/K 84, 154—155, 184 
Hopf algebra 264, 271, 379-380, 382, 391 
co-commutative 380 
quasi-triangularisable 380, 385 
quasi-triangularisable quasi- 398—399 
Hopf link 385-386, 400 
hyperbolic 
geometry 105—108, 116, 152 
plane 105-106 
reflections 406 
surface 108, 110 


ideal 53, 59-61, 74-75, 98, 102, 429 
idèle 157-158
index 
of subfactor 100, 386-387 
of subgroup 15 
injective limit 157—158 
inner-product 29, 37—38, 61, 69-70, 238 
instanton 204—206, 265 
intertwining operator 80, 279, 286-287, 331, 342, 
363-364, 375-376, 389-390 
inverse Galois problem 395 
inverse limit (see projective limit) 
Ising model 288-290, 366-367 
isogeny 393-394 
isometry 16, 107 
isomorphic groups 15 


j-function j(τ) 3-4, 6-7, 9, 124, 138, 147, 280,
288, 419 

J(τ) = j(τ) − 744 222, 224, 294, 407, 409-410,
415, 417 

Jacobi form/function 143-145, 153, 155, 176, 196, 
199, 224, 338, 348 

Jacobi identity 53, 320, 330, 350 
Jacobi triple product identity 221

Jacobian variety 123, 393-394, 401 

Jones index (see index, of subfactor) 

Jones polynomial 42-44, 94, 305, 307, 388 
Jordan–Hölder Theorem 18-19, 60


K3 surface 205, 432-433 
K-theory 169 
and character rings 370 
and fusion rings 370-371 
of Z 169 
Kac—Moody algebra 
and affine algebras 176, 210-212 
basic theory 209-211 
hyperbolic (Lorentzian) 210, 224-225, 375 
motivation 187—189 
representation theory 211-212, 223 
(see also affine algebra) 
Kac—Walton formula 369 
Kahler class 432 
Killing form κ(x|y) 60-61, 64-65, 69-71, 86, 205,
211, 368 
Knizhnik—Zamolodchikov (KZ) connection 150, 
201, 292, 399 
Knizhnik—Zamolodchikov (KZ) equation 201-202, 
290-291 
monodromy 27, 200, 202, 291, 421 
and Virasoro 185-186, 287, 292, 421 
(see also chiral block) 
knot 
ambient isotopic 41 
and braids (see braid groups and links) 
crossing number 42 
definition 41 
group (see group, knot) 
higher dimensional 41—42 
invariant (see link, invariant) 
(see also link) 
Krichever–Novikov algebra 216-217
Kronecker—Weber Theorem 101—102, 158 


L²(G), L²(G/Γ) etc. 21, 51, 84, 136-137, 155
labels a_i 170, 172, 190-191, 205, 431
Lagrangian 229-231, 235, 251, 270 
Lagrangian density 238-240, 255, 257-258, 261, 
263-264, 266, 271, 278, 280 

Langlands programme 137, 142, 158 
Laplacian 133, 148-149, 156, 242 
lattice 

automorphisms 32, 43-44, 409 

co-root 195-196 

definition of 6, 29-30 

determinant |L| 31, 328 

dimension 30, 328 

direct sum of (see direct sum, of lattices) 


dual L* 30, 134-135 
equivalent 30 
even 30, 135, 328 
indefinite 30, 214, 224, 328 
integral 30 
laminated 31-32 
positive-definite 30, 328 
root 31-32, 70-71, 169, 188 
self-dual 30, 140, 168, 333, 402 
and string theory (see string theory and lattices) 
and tori 32-33, 56 
VOA V(L) (see vertex operator algebra, lattice) 
weight 72-73 
(see also Leech lattice, theta series) 
Laurent polynomial 27, 42, 184, 187, 189, 206, 216 
Lax—Phillips scattering theory 154 
Lebesgue integral 46 
Lebesgue measure 45—46 
Leech lattice Λ
automorphism group Co_0 (see Conway groups)
definition 31-32 
and genus-0 property 428 
and McKay–Thompson series 408-409
and Moonshine module (see Moonshine module, 
construction) 
theta series 7, 135, 294, 415 
uniqueness 32, 414—415 
level 
of affine algebra module 192-194, 196, 207, 217, 
286, 326, 333 
critical 327, 331 
fractional 375 
Levi decomposition 60 
L-function 142, 157, 158, 394 
Lie algebra 
abelian 54, 69, 86 
automorphisms 78-80 
classification 54 
definition 7, 53 
free 212, 416 
geometric 349 
homomorphism 54, 59, 64, 66 
ideal of (see ideal) 
and lattices (see co-root) 
of observables 232, 240, 247—248 
orbit 81, 219-220 
radical of 60 
reductive 60, 197, 324 
self-dual 324 
semi-simple 60-61, 197, 226 
simple 60, 64, 196 
classification of finite-dimensional 61—63 
presentation 62 
structure 53, 68-71 
simply laced 169 
solvable 60-61, 68, 83-84
of vector fields 36, 55, 178, 184 
Lie derivative 36, 55, 217 
Lie group 
classification 59 
compact 60, 206 
complexification 300 
characters at elements of finite order 101, 288, 
368 
representations 82-85 
definition 55-56 
and Lie algebras 57-59, 64, 77-78, 206, 300 
and special functions 159 
and physics 231, 234-235, 254-255, 267-269, 
271, 286 
representations 44, 82, 85, 184 
structure 59 
line bundle 
definition 38 
determinant 185, 300 
sections 38, 84, 178 
link 
definition 41 
invariant 42, 94, 167, 265, 385 
(see also Jones polynomial) 
mirror image 42, 44, 400 
(see also knot) 
Liouville’s Theorem 116 
Littlewood-Richardson rule 370 
local conformal net 391 
local coordinates 34—35, 37, 185, 229, 322 
localisation functor 185 
locality (in physics) 237—238, 254, 318-320, 322, 
330 
loop (in quantum field theory) 278-279, 283 
loop algebra 178, 189-190, 192, 205-207, 215 
(see also affine algebra, nontwisted/twisted — 
construction) 
loop group 189, 194, 206-207, 312 
loop space LM 299, 352 
Lorentz group SO^+_{3,1}(R) 56, 108, 234-235,
254-255 


Maass form 156, 166 
Magic Square (see Freudenthal’s Magic Square) 
Magic Triangle (see Cvitanović’s Magic Triangle)
manifold 

ALE 204 

conformal 34 

definition 33-34 

invariants 306 

Riemannian 34, 37-38, 229, 236, 351, 353 

smooth vs topological structure 34, 56, 296-297
symplectic 85, 231, 232, 270 


mapping class group Γ_{g,n}
and conformal field theory 185-186, 288, 291, 
302, 392 
(see also chiral block) 
definition 120-122 
enhanced Γ̂_{g,n} 125, 299, 420
and Monstrous Moonshine 288, 291, 427 
projective representations of 125, 186, 203, 217, 
288, 291, 302, 392 
(see also braid group B_3; braid group B_n)
Markov move 43, 385, 388 
Mathieu groups 19, 402, 404, 409, 422, 426, 432 
and Leech lattice 20, 422 
Maxi-Moonshine 145, 292, 294, 424-426, 428 
McKay correspondence 171, 204—205, 374 
(see also A-D-E) 
McKay equation 3-4, 135, 402 
McKay–Thompson series T_g(τ) 5, 78, 95, 145, 218,
220, 407-408 
and the Leech lattice 409 
linear dependencies 337, 407-408 
and modular equations 418 
and replicability 409-411, 415 
meromorphic function 114, 116-117 
metaplectic group Mp2(R) 163-164, 165-166, 167 
minimal model (CFT) 288 
minimal polynomial 96, 102 
Mini-Moonshine 422—424 
Mirror Moonshine 432—433 
mirror symmetry 204—205, 225, 351, 395, 432-433 
Möbius transformations 107—108, 126, 155, 186, 
290, 309, 327 
mode 308, 314, 317, 330, 334 
modular data 183, 196, 288, 342-343, 359-361, 
367-371, 376, 382, 390 
and Galois 358, 360, 371, 397-398, 400 
modular equation 417—419, 422 
modular fiction 411-412, 418-419, 422 
modular form 196 
definition 127—128, 155 
of fractional weight 127, 129, 132, 165, 421 
for SL2(Z) 343, 352 
Siegel 152-153, 155 
vector-valued 134-135, 343 
(see also automorphic form; Borcherds’ lift) 
modular function 
definition 2, 127 
for SL2(Z) 2-3 
vector-valued 196, 199 
(see also Hauptmodul) 
modular functor (Segal) 302-303, 307, 310, 322, 
349-350, 376, 386 
modular group SL2(Z) (see SL2(Z)) 
modular invariant 361—362, 373-374, 377, 383, 390 
(see also partition function) 
Modular Moonshine 428-430
modular representation 25—26, 28, 29, 428-429 
modular tower 158, 292, 397 
module 
completely reducible 22, 67 
direct sum 66 
indecomposable 23 
irreducible 22, 67 
simple (see irreducible above) 
submodule 67 
module for group 
contragredient 22, 25 
definition 20, 82 
direct sum 21—24, 84 
dual (see contragredient above) 
and representation 20 
tensor product 22, 24-25, 28 
unitary 22-23, 82-83 
from VOA 329, 331 
(see also character; module) 
module for Lie algebra 
for abelian g 86 
admissible (see level, fractional) 
contragredient 66 
definition 66 
derived 82-83, 86, 156, 161, 207 
direct sum 66-67, 76 
dual (see contragredient above) 
highest-weight 67, 74, 192-193 
integrable 194—196, 205, 368 
vs Lie group module 66, 77, 82-84, 207 
and representation 66 
for solvable g 68 
tensor product 66 
twisted 218—220 
unitary 66—67, 86, 194 
Verma 67, 70, 74-77, 192-193 
from VOA 324 
(see also affine algebra; character; module) 
module for vertex operator algebra 
in CFT 286, 289, 309 
characters (see characters; graded dimension; 
vertex operator algebra) 
contragredient 332 
definition 309, 330 
dual (see contragredient above) 
graded dimension (see graded dimension; vertex 
operator algebra) 
lowest-weight space Mp 332, 334-335 
twisted 220, 293-295, 345-348, 425 
unitary 332 
moduli space M_{g,n}
in CFT 124-125, 185-186, 281, 283, 285, 
290-292 
definition 119-121 


Deligne—Mumford compactification (see 
Deligne—Mumford compactification) 
enhanced M̂_{g,n} 124-125, 185-186, 281, 287,
291, 299 
in string theory 124, 278-280 
and Virasoro, Witt 84, 185—186, 281, 287, 292, 
301, 318, 348, 421 
momentum 231, 240, 242-243, 254, 256-257 
canonical (see generalised above) 
generalised 232, 238, 255 
monodromy representation 200-203, 204-205 
in CFT 11, 186, 290-291 
and KZ equation (see Knizhnik—Zamolodchikov 
equation) 
and modularity 200, 202-204, 291 
in Moonshine 11, 150 
Monster M 
as 6-transposition group 11, 166, 426-427, 431
as Aut(X) 121 
centralisers in 403-404, 414, 423, 425, 429-431 
character table 403, 405, 407 
conjugacy classes 5, 403, 405 
history 19-20, 403 
representations 4—5, 23-24, 177, 405 
size (order) 4 
and other sporadics 20, 403, 430 
and V♮ 9, 414
Monster Lie algebra m 214 
construction 415—416 
denominator identity 222, 223, 225, 415—416 
and Monster 415—416 
Monstrous Moonshine 
conjectures 5, 8, 407 
and physics 9-10, 303, 433-434 
Moonshine module V♮ 328
automorphism group 406, 414, 426 
construction 294, 347, 414 
graded dimension J(τ) 294, 343, 407
and Griess algebra 325, 414 
invariant bilinear form 415 
twisted modules 220, 425, 428, 430, 434 
moonshine-type 138—140, 407, 411, 423, 425 
morphism (see arrow) 
Mostow Rigidity Theorem 110, 122 
M-theory 212, 280, 375 
multiplier (for modular forms) 127, 131—132, 134, 
165, 343 
mutually local 272-273, 316, 322 


Nahm’s Conjecture 372 

NIM-rep 361, 374-375, 383, 390-391 

Noether’s Theorem 231-232, 240, 256, 259, 268,
284 

No-Ghost Theorem 416 

non-commutative geometry 117, 265, 271, 380 


non-orientable surface 111 


normal-ordering 181, 198-199, 258, 317, 335-336, 371
Norton series N_{(g,h)}(τ) 145, 425, 428, 432
Norton’s Conjecture (see Maxi-Moonshine) 
n-point function (see correlation function) 
n-torus 33, 84, 123, 205, 215, 250, 393-394 
null vector 67, 76, 216, 290, 308, 327, 330 


objects in category 87 
octonions 53, 56, 61 
operator product expansion (OPE) 266, 283-285, 
301, 317, 326-327, 357 
orbifold 204, 292 
and CFT 293-295 
holomorphic 292-294, 347, 382, 384, 392 
and Maxi-Moonshine 292, 294, 295, 425-426
permutation 295, 426 
and strings 292-293 
and VOAs 328, 345-348 
(see also module from VOA Vr, twisted) 
orbit method 85, 137, 184-185, 207 
orbital integral 137 
order ||G|| of group 15 
oscillator algebra tı 197-198, 223, 325-326, 377 


p-adic numbers 58, 157—158, 292, 396, 399-400 

pair-of-pants 125, 279, 297, 301, 307, 322-323, 
350, 363 

paragroup 389-390 

pariahs 424 

particle 154, 237, 254, 256-259, 273 


partition function 260, 283, 287, 289, 301-302, 314, 353, 361, 373
(see also modular invariant) 
path integral 250-251, 266, 279-280, 300 
(see also Feynman path formulation) 
pentagon axiom 89-90, 398-399 
Perron—Frobenius theory 172-173, 175, 357, 374, 
387, 389, 391 
Peter–Weyl Theorem 82, 84-85
phase space 119, 232, 242, 248 
Picard’s Theorem 116 
Planck’s constant ħ 242
Poincaré group 17, 56, 179, 234, 247, 250, 254, 
267-268, 272 
Poincaré—Birkhoff—Witt Theorem 74—75, 182 
Pontrjagin duality 136
Poisson bracket 231, 232, 238, 247, 255 
Poisson summation formula 131, 135-138, 140 
and theta function modularity 8, 104, 131, 134, 
137-138, 143, 152 
positive energy representations 182, 207, 300 
potential V 227—229, 242 
p-regular element 428 
primaries (sectors) 286-287, 356-357, 360, 386
(see also state, primary) 
principal gradation 193—194 
profinite completion Ĝ 396-400
projective geometry 38-39, 406 
projective limit 157—158, 271, 291-292, 392, 
395-396 
projective n-space P^n(R), P^n(C) 39, 116-117, 353,
370, 393, 431-432 
projective representation 83, 176—179, 186, 207, 
218, 345 
and central extensions (see central extensions) 
and CFT 125, 186, 285, 291, 295, 296, 299-302 
projectively equivalent 177 
and quantum theories 177, 242, 247, 254, 272 
and two-cocycle 177-178 


quadratic Casimir 86, 156, 199, 201, 288, 327 
quantisation 247—248, 255, 259 
(see also geometric quantisation) 
quantum cohomology (see Gromov—Witten 
invariants) 
quantum dimension 101, 337, 357, 381 
quantum double 357, 380, 382, 390 
quantum electrodynamics (QED) 255, 261, 262, 
268-269 
quantum field 117, 237, 253-254, 272, 282-283, 
312-313, 322 
quantum field theory 117, 226, 252-253 
axiomatisations 271-275, 390 
mathematical difficulties 167, 257—258, 262-264, 
265-266, 270-271 
nonperturbative effects/calculations 262, 265, 
280 
nonrenormalisable 263-264, 277 
and number theory 154, 264—265, 395, 400-401 
particles (quanta) 256-259, 273 
perturbation 205, 259-262, 279 
renormalisable 253 
quantum group 75, 125, 202, 378-381, 386 
at root of unity 381, 386 
quantum mechanics 
Feynman’s formalism 250-251
Heisenberg’s formalism 247—249 
identical subsystems 249-250 
measurement problem 243-246 
perturbation 251-252 
probability 241, 242 
Schrödinger’s formalism 241-242, 246-247
quantum Schubert calculus 370 
quasi-periodic 143, 151, 159 
quasi-primary (see state, quasi-primary) 
quasi-symmetric homeomorphism 185 
quaternions 17, 24, 53, 56, 61, 64, 351 
quilt 427 
Racah coefficients (or 6j-symbols) 365
Racah-Speiser formula 369 
Ramanujan τ-function 142, 221
rational conformal field theory (see conformal field 
theory, rational) 
ray representation (see projective representation) 
regular representation (see group algebra; L²(G))
regularisation 
and Lie theory 198—199, 270, 371 
and number theory 118, 133, 270 
in quantum field theory 133, 263-264, 270-271 
zeta-function 133, 140, 199 
Reidemeister moves 42, 383, 391 
relativity 
general 226, 233, 236, 239, 263, 267—268, 269, 
277 
special 233-236, 237, 238, 246, 247, 250, 252, 
254, 272 
renormalisation 260, 262—265, 270-271 
replicable functions 410-412, 415-416, 422, 426 
and the power map g” 409 
representation (see module) 
ρ-shift (Lie algebras) 217, 371
ribbon 93 
(see also category, Ribbon) 
Riemann sphere 2, 39, 138, 285 
Riemann surface 87, 114-117, 216-217 
in CFT 264, 277, 281-282, 286, 298 
and conformal structure 114, 281, 290 
Riemann zeta function ζ(s) 127-128, 133, 140-142,
154, 168-169, 199 
Riemann-Hilbert problem 203 
R-matrix universal (see universal R-matrix) 
root 68, 73 
highest 193 
imaginary 68, 191, 193, 211, 212, 214, 224-225,
415-416
lattice (see lattice, root) 
positive 70, 73-75, 77, 193, 220, 415-416 
real 191, 193, 211, 214, 224, 415-416 
simple 69-70, 72, 79, 187, 190, 193, 214, 
415—416 
space 69-70, 415—416 
space decomposition 69, 86, 191, 210, 213, 
415—416 
system 69-70, 72 


scale invariance 230, 284-285, 297 

scattering matrix (see S-matrix) 

Schur multiplier 177—179 

Schur polynomial 72 

Schur’s Lemma 23, 67, 80, 161, 192, 216, 331 

Schrödinger’s equation 242, 243, 245, 246, 247,
253, 275-276 

Schwartz space S(R^n) 45-49, 131, 136, 162, 241,
253 
Schwarzian derivative 340
sector 
superselection 242, 296 
twisted 293-295, 392 
Segal’s axioms (CFT) 298-303, 305, 310, 348-350 
Selberg trace formula 137, 154 
self-adjoint operator 47, 49, 52, 242, 247, 275, 
354-355 
semi-direct product 16-17, 18, 28-29, 56, 85, 195 
semi-direct sum 53, 59—60, 406 
sewing 293, 299-302, 306-307, 310, 339, 341, 
363-364, 397 
sheaf 34-35, 351-352 
Siegel upper half-space H_g 122-123, 151-152, 155
simple factor (in geometry) 393-394 
simple-current 357—358, 360, 362, 367, 369, 373, 
391 
simply connected 40, 43, 83, 115, 184, 200, 202, 
206-207 
singular vector (see null vector) 
singularity 
blowing up 117 
minimal resolution 171, 204-205, 402 
quotient (orbifold, conical) 116, 120-121, 170, 
204, 292, 374, 402 
resolution 171 
simple 170-171, 204 
(see also A-D-E; McKay correspondence)
SL2(R) 86, 107, 109-110, 125, 160 
and Lorentz group 108, 154 
and modular forms 154—157, 223 
representations of 86, 155-156 
universal cover 11, 164-165, 167, 184 
SL2(Z) 109-110, 116, 126, 152 
in CFT 205, 288, 291, 293 
representations of 177, 293, 342, 359 
(see also modular data) 
and tori 10, 120-121, 130, 280 
(see also braid group B_3; modular form; modular
function) 
S-matrix 154, 252, 259, 265, 281 
space-time 34, 229, 265, 277, 280-281, 292, 297, 
301 
Minkowski 29, 56, 233-234, 236, 239, 268, 274, 
281-282 
Spectral Theorem 48—49, 242-243, 273 
speed c of light 233-235, 237, 255, 282 
sphere S^n 40, 42, 53, 55-56
sphere-packing 31-32 
spin 83, 250, 254-255, 296, 352 
spinor 83 
sporadic group (see finite simple group) 
squirrel 12, 59, 329, 435 
Standard Model 255-256, 262-264, 266, 269, 273, 
277-278, 280 
star-triangle relation (see Yang—Baxter equation) 
state
BPS 205, 214 
incoming 258-259, 266, 273, 278-279, 283, 
300-301 
outgoing 258-259, 266, 273, 278-279, 
300-301 
primary 285-286, 287, 323, 325, 340 
quasi-primary 285, 366 
state space H 249, 253-254, 272, 275, 286, 
298-299, 322, 361 
state-field correspondence 283, 318 
statistical mechanics 266, 281 
Stone—von Neumann Theorem 161, 163, 265 
stress-energy tensor 239-240, 284—285, 288, 292, 
301, 322 
string theory 277-280 
and CFT 9, 252, 276-277, 279, 281, 289, 300 
and lattices 10, 280, 293-294, 360 
(see also vertex operator algebra, lattice) 
and Lie groups (see Wess—Zumino—Witten model) 
and modular forms 9—10, 166, 264, 279-280, 408 
and moduli space (see moduli space in string 
theory) 
and Monstrous Moonshine 280, 433-434 
perturbative 9, 252, 264, 276, 278-279, 433 
subfactor 
basic construction 387-388 
and braids 27, 125, 167, 202, 288, 388 
and CFT 44, 374, 386, 388-391 
definition 49, 386 
analogy with Galois 386, 389 
and knots 49, 386-388 
and orbifolds 391, 426 
subgroup 
commutator [GG] 29 
index (see index, subgroup) 
normal 14-17, 24, 59 
Sugawara construction 199, 217, 324, 326 
superposition 243—245 
supersymmetry (see conformal field theory, super; 
vertex operator superalgebra) 
surface 33 
enhanced 124—125, 185, 281, 283, 287, 291, 300, 
349 
K3 (see K3 surface) 
with nodes 124, 280, 291, 292 
(see also cusp; Deligne-Mumford 
compactification) 
Riemann (see Riemann surface) 
stable (see surface with nodes) 
symmetric group S_n 17, 20-24, 93, 100, 187-188,
250, 296 
as Weyl group 71, 78, 220 
(see also braid groups) 


Tamanoi’s invariant 351, 353 
tangent
bundle T M 35, 38, 53, 56 
space T_p(M) 35-38, 56-59
vector 35-38 
Taniyama—Shimura conjecture 130, 142 
Tannaka—Krein duality 90, 136, 328 
Teichmüller space T_{g,n} 120-122, 291
universal 185 
theta function 
Jacobi 118-119, 131, 142-143, 160, 203, 
223-224 
modularity 8, 104, 131, 138, 143, 147-148, 
163-164, 280 
Siegel’s 123, 151 
theta series, lattice 7, 134—135, 138, 140, 143, 
195-196, 280, 294 
Thompson trick 4—5, 24, 80, 307, 346, 424, 425-426 
topological field theory 100, 303, 305-307, 363, 
388, 421 
and conformal field theory 287, 305, 307 
topological span 46, 241, 253 
toroidal algebra 215-216 
torus S¹ × S¹
and CFT 287, 292-293, 301-302 
conformal structures on 44, 120, 301, 341 
diffeomorphisms 44 
and elliptic curves 110, 112, 118, 123-124 
fundamental group 40, 113 
(see also elliptic curve; n-torus; SL2(Z)) 
tower 157, 158, 291, 397 
trace 
in CFT 241, 293, 300 
as character 5, 23, 25, 77, 183 
and determinant 58 
and sewing 341 
in von Neumann algebras 50-52 
trefoil 41-42, 164-166, 383-384, 391 
triangle axiom 89-90, 398-399 
triangular decomposition (Lie algebra) 70, 182, 
191-192, 210, 213 
Turaev—Viro theory 385, 389 
twenty-four 168-169, 198, 394, 402, 414, 416, 419, 
425 
(see also c/24) 
twining character (see character, twisted) 


uniformisation 3, 115—116 

unitary dual Ĝ 82-86, 136-137

unitary operator 22, 47—49, 247, 254 

unitary representation (see Lie algebra; module for 
group; VOA) 

universal cover 115—116, 200-201 

group G̃ 59, 83, 165, 178, 234, 254, 291

universal enveloping algebra U (g) 74-75, 156-157, 
182, 198-199, 327, 334, 378-380 

universal R-matrix 380 
unknot 41
upper half-plane H 2, 105, 110, 120, 127, 154 
(see also hyperbolic plane) 


V♮ (see Moonshine module)
vacuum (sector) 194, 302, 325, 342, 356, 360-361, 
368, 382, 390 
vacuum |0⟩ (state) 248, 253, 256-257, 259,
272-273, 280, 301, 318-319, 322-323 
vacuum-to-vacuum expectation value (see 
correlation function) 
Vandermonde matrix 221 
variety 94, 351, 393-395, 396 
Vect(M) 36, 55, 59, 178, 184 
Vect(S¹) (see Witt algebra)
vector bundle 
base of 38 
connection on 38 
definition 38 
fibre of 38 
G-equivariant 382 
section of 38, 134, 291 
vector field 35-36, 53, 55, 57, 178, 184, 188 
and lattices 32, 70, 169
and weights 70, 73-74, 78, 195 
Weyl reflection 70 
Weyl vector ρ 77, 195, 368, 371
Weyl–Kac character formula 196, 211, 221
Weyl–Kac–Borcherds character formula 214, 220